# google/highway

**Attribution required: if you use, quote, or summarise this content, you must credit and link back to [awesome-repositories.com](https://awesome-repositories.com/repository/google-highway).**

5,644 stars · 445 forks · C++ · License: NOASSERTION

## Links

- GitHub: https://github.com/google/highway
- awesome-repositories: https://awesome-repositories.com/repository/google-highway.md

## Description

Highway is a portable C++ library and hardware abstraction layer designed for writing single instruction multiple data (SIMD) code. It provides a unified interface that maps data-parallel logic to various CPU instruction sets, enabling the development of high-performance software that runs across different processor architectures without requiring architecture-specific assembly.

The project features a dynamic instruction dispatcher that selects the most efficient CPU instruction set at runtime based on detected hardware. It also supports static target specialization and extensible mechanisms for adding new hardware targets or custom SIMD operations.
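The runtime selection described above can be sketched in plain C++ as a table of function pointers indexed by a detected target. This is a minimal illustration under stated assumptions, not Highway's implementation: `Target`, `DetectBestTarget`, `SumBlocked`, and `ChooseSum` are hypothetical names, the capability bits are passed in rather than read from the CPU, and the per-target variants differ only in block size instead of using real intrinsics. In Highway itself this role is played by macros such as `HWY_EXPORT` and `HWY_DYNAMIC_DISPATCH`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical target identifiers, ordered from least to most capable.
enum class Target : std::uint8_t { kScalar = 0, kWide4 = 1, kWide8 = 2 };

// One implementation per target. Real code would use target-specific
// intrinsics; in this sketch each variant only uses a different block size.
template <std::size_t kBlock>
float SumBlocked(const float* data, std::size_t n) {
  assert(data != nullptr || n == 0);
  float partial[kBlock] = {};
  std::size_t i = 0;
  for (; i + kBlock <= n; i += kBlock) {
    for (std::size_t lane = 0; lane < kBlock; ++lane) {
      partial[lane] += data[i + lane];
    }
  }
  float total = 0.0f;
  for (std::size_t lane = 0; lane < kBlock; ++lane) total += partial[lane];
  for (; i < n; ++i) total += data[i];  // leftover elements
  return total;
}

using SumFn = float (*)(const float*, std::size_t);

// Stand-in for CPU feature detection (assumption: bit 1 means the 4-wide
// target is supported and bit 2 means the 8-wide target is supported).
Target DetectBestTarget(std::uint32_t supported_bits) {
  if (supported_bits & (1u << 2)) return Target::kWide8;
  if (supported_bits & (1u << 1)) return Target::kWide4;
  return Target::kScalar;
}

// Maps a target to its implementation through a function-pointer table.
SumFn ChooseSum(Target target) {
  static constexpr SumFn kTable[] = {&SumBlocked<1>, &SumBlocked<4>,
                                     &SumBlocked<8>};
  return kTable[static_cast<std::size_t>(target)];
}
```

A caller would typically resolve the pointer once, for example `const SumFn sum = ChooseSum(DetectBestTarget(bits));`, and reuse it for every subsequent call so that detection cost is paid only once.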

The library covers a broad range of vector operations, including element-wise arithmetic, lane reduction, shuffling, and masked conditional execution. It also includes a vectorized math library, helpers for aligned memory allocation, masked load and store operations, and primitives for hardware-accelerated cryptography.
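To make the terms "masked conditional execution" and "lane reduction" concrete, the following scalar model treats a vector as a fixed array of lanes. It is a sketch only: `SketchVec`, `SketchMask`, `GreaterThan`, `Select`, and `ReduceSumLanes` are hypothetical stand-ins, and the four-lane width is an assumption. Highway offers operations in the same spirit, such as `Gt`, `IfThenElse`, and `ReduceSum`, which operate on real vector registers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kSketchLanes = 4;  // assumed fixed vector width
using SketchVec = std::array<float, kSketchLanes>;
using SketchMask = std::array<bool, kSketchLanes>;

// Lane-wise comparison: yields one mask entry per lane.
SketchMask GreaterThan(const SketchVec& a, const SketchVec& b) {
  SketchMask mask{};
  for (std::size_t i = 0; i < kSketchLanes; ++i) mask[i] = a[i] > b[i];
  return mask;
}

// Lane-wise select: takes yes[i] where mask[i] is set, otherwise no[i].
// On SIMD hardware this is a single blend, with no per-lane branch.
SketchVec Select(const SketchMask& mask, const SketchVec& yes,
                 const SketchVec& no) {
  SketchVec out{};
  for (std::size_t i = 0; i < kSketchLanes; ++i) {
    out[i] = mask[i] ? yes[i] : no[i];
  }
  return out;
}

// Horizontal reduction: collapses all lanes into one scalar.
float ReduceSumLanes(const SketchVec& v) {
  float total = 0.0f;
  for (std::size_t i = 0; i < kSketchLanes; ++i) total += v[i];
  return total;
}

// Replaces non-positive lanes with zero, then sums the result.
float SumOfPositiveLanes(const SketchVec& v) {
  const SketchVec zero{};
  return ReduceSumLanes(Select(GreaterThan(v, zero), v, zero));
}

// Usage check: of the lanes {1, -2, 3, -4}, only 1 and 3 survive the mask.
inline void CheckSumOfPositiveLanes() {
  assert(SumOfPositiveLanes(SketchVec{{1.0f, -2.0f, 3.0f, -4.0f}}) == 4.0f);
}
```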

Tooling is provided for the automated compilation and validation of hardware-accelerated instructions across multiple processor architectures.

## Tags

### Programming Languages & Runtimes

- [Hardware Dispatchers](https://awesome-repositories.com/f/programming-languages-runtimes/runtime-execution-environments/runtime-environments/runtime-internals-foundations/runtime-architecture/hardware-dispatchers.md) — Features a dynamic dispatcher that detects CPU capabilities at runtime to select the most efficient instruction set.
- [Loop Vectorization](https://awesome-repositories.com/f/programming-languages-runtimes/looping-constructs/loop-unrolling-transformations/loop-vectorization.md) — Provides a portable interface to implement loop vectorization using strip-mining strategies across various CPU architectures. ([source](https://github.com/google/highway/blob/master/README.md))
- [Vector Masking and Reduction](https://awesome-repositories.com/f/programming-languages-runtimes/vector-masking-and-reduction.md) — Provides portable sum, min, and max reductions that aggregate vector lanes into a single value.
- [Vector Element Comparison](https://awesome-repositories.com/f/programming-languages-runtimes/vector-masking-and-reduction/vector-element-comparison.md) — Evaluates equality and relational conditions between vectors to create binary masks for conditional processing. ([source](https://github.com/google/highway/blob/master/g3doc/quick_reference.md))
- [Vector Mask Manipulation](https://awesome-repositories.com/f/programming-languages-runtimes/vector-masking-and-reduction/vector-mask-manipulation.md) — Provides tools to create and combine binary masks for conditional execution on specific vector lanes. ([source](https://github.com/google/highway/blob/master/g3doc/quick_reference.md))
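The strip-mining strategy named in the Loop Vectorization entry splits a loop into whole-vector iterations plus a remainder that is handled under a mask. The sketch below models that in scalar C++; `kStripLanes` and `MulAddStripMined` are hypothetical names, and the fixed width of eight lanes is an assumption (in Highway the width comes from `Lanes(d)` for a given tag `d`).

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kStripLanes = 8;  // assumed vector width in lanes

// Computes x[i] = mul[i] * x[i] + add[i] for i in [0, n).
void MulAddStripMined(const float* mul, const float* add, std::size_t n,
                      float* x) {
  assert(n == 0 || (mul != nullptr && add != nullptr && x != nullptr));
  std::size_t i = 0;

  // Main loop: each iteration covers one full vector of kStripLanes lanes.
  for (; i + kStripLanes <= n; i += kStripLanes) {
    for (std::size_t lane = 0; lane < kStripLanes; ++lane) {
      x[i + lane] = mul[i + lane] * x[i + lane] + add[i + lane];
    }
  }

  // Remainder: one final partial vector. Lanes at or beyond `remaining`
  // are masked off, so nothing past the end of the arrays is touched.
  const std::size_t remaining = n - i;
  for (std::size_t lane = 0; lane < kStripLanes; ++lane) {
    const bool active = lane < remaining;  // mask bit for this lane
    if (active) {
      x[i + lane] = mul[i + lane] * x[i + lane] + add[i + lane];
    }
  }
}
```

With `n = 11`, the main loop handles elements 0 to 7 and the masked remainder handles elements 8 to 10.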

### Software Engineering & Architecture

- [Hardware Abstraction Layers](https://awesome-repositories.com/f/software-engineering-architecture/hardware-abstraction-layers.md) — Acts as a hardware abstraction layer that maps processor-specific intrinsics into a unified C++ API.
- [Hardware Instruction Targeting](https://awesome-repositories.com/f/software-engineering-architecture/performance-reliability/performance-optimization/computational-efficiency/cpu-optimization-strategies/hardware-instruction-targeting.md) — Executes specialized code paths by targeting specific hardware instructions based on the detected CPU target. ([source](https://google.github.io/highway/en/master/))
- [Vector Instruction Targets](https://awesome-repositories.com/f/software-engineering-architecture/performance-reliability/performance-optimization/computational-efficiency/cpu-optimization-strategies/hardware-instruction-targeting/instruction-set-targets/vector-instruction-targets.md) — Optimizes binaries by resolving specific SIMD instruction set targets like SSE or AVX during compilation.
- [Vector Instruction Mapping](https://awesome-repositories.com/f/software-engineering-architecture/performance-reliability/performance-optimization/computational-efficiency/cpu-optimization-strategies/hardware-instruction-targeting/vector-instruction-mapping.md) — Provides a portable interface that maps data-parallel logic to architecture-specific vector instruction intrinsics. ([source](https://github.com/google/highway/blob/master/meson_options.txt))
- [Compile-Time Metaprogramming](https://awesome-repositories.com/f/software-engineering-architecture/software-architecture/architectural-patterns/abstraction-domain-modeling/compile-time-architectural-patterns/compile-time-metaprogramming.md) — Implements compile-time logic to resolve optimal data types and vector widths across different hardware targets.

### Data & Databases

- [SIMD-Based Data Parallelism](https://awesome-repositories.com/f/data-databases/vectorized-arithmetic/simd-accelerated-arithmetic/simd-based-data-parallelism.md) — Provides a portable interface to write data-parallel code that maps to hardware-accelerated SIMD instructions.
- [High-Performance Data Analysis](https://awesome-repositories.com/f/data-databases/high-performance-data-analysis.md) — Accelerates mathematical operations, sorting, and hashing with optimized low-level vector algorithms.
- [Vectorized Memory Access](https://awesome-repositories.com/f/data-databases/vector-data-processing/vectorized-memory-access.md) — Moves data between memory and registers using aligned, unaligned, masked, or interleaved access patterns. ([source](https://github.com/google/highway/blob/master/g3doc/quick_reference.md))
- [Masked Memory Operations](https://awesome-repositories.com/f/data-databases/vector-data-processing/vectorized-memory-access/masked-memory-operations.md) — Implements masked load and store operations to prevent out-of-bounds memory access during partial vector processing. ([source](https://github.com/google/highway#readme))
- [Vector Lane Shuffling](https://awesome-repositories.com/f/data-databases/vector-element-manipulations/vector-lane-shuffling.md) — Implements vector lane shuffling including reverses, slides, and broadcasts to rearrange elements within registers. ([source](https://github.com/google/highway/blob/master/g3doc/quick_reference.md))
- [Cryptographic Accelerators](https://awesome-repositories.com/f/data-databases/vectorized-arithmetic/simd-accelerated-arithmetic/cryptographic-accelerators.md) — Implements hardware-accelerated cryptographic primitives and carryless multiplication using SIMD vector instructions. ([source](https://github.com/google/highway/blob/master/g3doc/quick_reference.md))
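Carryless multiplication, mentioned in the Cryptographic Accelerators entry, multiplies two integers as polynomials over GF(2): partial products are combined with XOR instead of addition, so no carries propagate. The scalar reference below shows the arithmetic only. `CarrylessMul32` is a hypothetical helper working on 32-bit inputs, whereas hardware instructions and Highway's `CLMulLower` operate on 64-bit lanes and produce 128-bit products.

```cpp
#include <cassert>
#include <cstdint>

// Carryless multiply of two 32-bit values, returning the 64-bit product.
// Each set bit of `b` contributes a shifted copy of `a`, combined with XOR.
std::uint64_t CarrylessMul32(std::uint32_t a, std::uint32_t b) {
  std::uint64_t result = 0;
  for (int bit = 0; bit < 32; ++bit) {
    if ((b >> bit) & 1u) {
      result ^= static_cast<std::uint64_t>(a) << bit;
    }
  }
  return result;
}

// Usage check: (x + 1) * (x + 1) = x^2 + 1 over GF(2), so 3 * 3 gives 5
// here rather than the integer product 9.
inline void CheckCarrylessMul32() {
  assert(CarrylessMul32(3u, 3u) == 5u);
}
```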

### Operating Systems & Systems Programming

- [Cross-Architecture Performance Strategies](https://awesome-repositories.com/f/operating-systems-systems-programming/cross-architecture-performance-strategies.md) — Enables high-performance software execution across multiple processor types without requiring architecture-specific assembly.
- [Instruction Set Targeting](https://awesome-repositories.com/f/operating-systems-systems-programming/instruction-set-targeting.md) — Selects the best available CPU instruction set during program execution to maximize performance on heterogeneous hardware.
- [Runtime Instruction Dispatchers](https://awesome-repositories.com/f/operating-systems-systems-programming/runtime-instruction-dispatchers.md) — Ships a dynamic instruction dispatcher that selects the most efficient CPU instruction set based on detected hardware.
- [SIMD Abstraction Layers](https://awesome-repositories.com/f/operating-systems-systems-programming/simd-abstraction-layers.md) — Provides a portable C++ template interface that maps to architecture-specific SIMD intrinsics at compile time.
- [SIMD Libraries](https://awesome-repositories.com/f/operating-systems-systems-programming/simd-libraries.md) — Provides a portable C++ library for writing SIMD code that maps to various CPU instruction sets.

### Scientific & Mathematical Computing

- [Conditional Execution Masks](https://awesome-repositories.com/f/scientific-mathematical-computing/mask-based-filtering/conditional-execution-masks.md) — Uses binary masks to control vector lane operations and avoid branching in data-parallel code.
- [Vectorized Math Functions](https://awesome-repositories.com/f/scientific-mathematical-computing/hardware-level-vectorization/vectorized-math-functions.md) — Offers a vectorized math library for fast trigonometric, hyperbolic, and complex arithmetic using CPU instructions. ([source](https://github.com/google/highway/blob/master/hwy_tests.bzl))
- [Parallel Math Libraries](https://awesome-repositories.com/f/scientific-mathematical-computing/parallel-math-libraries.md) — Provides a collection of hardware-accelerated functions for performing element-wise and complex math on vectors.
- [Vector Mathematics](https://awesome-repositories.com/f/scientific-mathematical-computing/vector-mathematics.md) — Implements hardware-accelerated dot products and other mathematical functions across various CPU architectures. ([source](https://github.com/google/highway/blob/master/libhwy-contrib.pc.in))
- [Vector Arithmetic Operations](https://awesome-repositories.com/f/scientific-mathematical-computing/vector-operations/vector-arithmetic-operations.md) — Provides hardware-accelerated element-wise arithmetic and fused multiply-add operations across diverse CPU architectures. ([source](https://github.com/google/highway/blob/master/g3doc/quick_reference.md))
- [Vector Type Converters](https://awesome-repositories.com/f/scientific-mathematical-computing/vector-type-converters.md) — Provides functions for changing element widths and casting between integer and floating-point representations in vector registers. ([source](https://github.com/google/highway/blob/master/g3doc/quick_reference.md))
- [Conditional Vector Operations](https://awesome-repositories.com/f/scientific-mathematical-computing/vectorized-operations/conditional-vector-operations.md) — Implements masked conditional execution to control which vector lanes perform operations without branching. ([source](https://github.com/google/highway/blob/master/g3doc/design_philosophy.md))

### Part of an Awesome List

- [Custom SIMD Operations](https://awesome-repositories.com/f/awesome-lists/devtools/assemblers/simd-implementations/custom-simd-operations.md) — Allows adding new hardware-accelerated functions by writing target-specific implementations for all supported architectures. ([source](https://github.com/google/highway/blob/master/g3doc/impl_details.md))
- [SIMD Memory Alignment](https://awesome-repositories.com/f/awesome-lists/devtools/memory-allocators/simd-memory-alignment.md) — Ensures memory allocations start on specific byte boundaries to maximize CPU cache efficiency and prevent faults. ([source](https://github.com/google/highway/blob/master/g3doc/quick_reference.md))
- [SIMD Memory Managers](https://awesome-repositories.com/f/awesome-lists/devtools/memory-allocators/simd-memory-alignment/simd-memory-managers.md) — Includes helpers for aligned allocation, alongside masked load and store operations, to support vector processing.
- [Parallel Programming Frameworks](https://awesome-repositories.com/f/awesome-lists/devtools/parallel-programming-frameworks.md) — Performance-portable SIMD intrinsics for various architectures.

### DevOps & Infrastructure

- [Hardware Target Extensions](https://awesome-repositories.com/f/devops-infrastructure/multi-architecture-hardware-targeting/hardware-target-extensions.md) — Integrates new processor architectures by defining target identifiers and connecting them to dispatch mechanisms. ([source](https://github.com/google/highway/blob/master/g3doc/impl_details.md))

### Security & Cryptography

- [Hardware-Accelerated Primitive Substitutions](https://awesome-repositories.com/f/security-cryptography/cryptography/cryptographic-primitives/hardware-accelerated-primitive-substitutions.md) — Exposes building blocks for hashing and encryption, such as AES rounds and carryless multiplication, backed by processor-specific instructions where available.
