30 open-source projects similar to pegjs/pegjs, ranked by how many features they have in common. Compare stars, activity, and what each one does to find the best PEG.js alternative.
Pest is a Rust parsing library and automatic parser generator that transforms formal grammar definitions into functional parsers. It uses parsing expression grammars (PEGs) to recognize and structure complex text patterns, providing a system for context-free grammar parsing. The library implements zero-copy tokenization and static grammar compilation to reduce runtime overhead. It supports no-std runtime compatibility, allowing the parser to be compiled for embedded or bare-metal environments where a standard library is unavailable. The project covers a range of parsing capabilities, inclu
ANTLR is a grammar-based code generator and multi-language parser generator used to design and implement custom languages. It functions as a toolkit for transforming formal language definitions into executable source code for processing structured text or binary files, while providing a framework for automatically constructing and traversing hierarchical parse trees. The project is distinguished by its ability to generate lexers and parsers in various target programming languages from a single shared grammar definition. It supports grammars containing direct left recursion and utilizes adapti
Jison is a JavaScript parser generator that implements the LALR parsing algorithm. It creates tools to analyze custom programming languages by converting structured input into tokens and trees. The project functions as a Bison-compatible generator, accepting grammars in a format compatible with the Bison parser generator to produce JavaScript parsers. It covers the requirements for compiler frontend development, including the implementation of domain-specific languages and syntax analysis tooling. Its capabilities extend to custom language parsing and the generation of parsers via a command
Ohm is a formal grammar parser generator and domain-specific language framework. It provides a system for defining custom languages to parse, validate, and extract data from input text, transforming raw strings into hierarchical abstract syntax trees based on specified formal rules. The project is based on parsing expression grammars (PEGs) and supports left-recursive rules without requiring predefined operator precedence. It also includes a dedicated debugging toolkit for tracing and visualizing the step-by-step state transitions
Bhai-lang is a TypeScript-based toy programming language and custom syntax interpreter. It functions as an educational language implementation designed to demonstrate core concepts of variable management, conditional logic, and execution flow. The project provides a custom command line interface and an interactive code playground for writing and testing scripts. It serves as a framework for programming language prototyping, allowing for the definition of custom syntax and execution logic. The system covers the full interpreter pipeline, including lexical analysis, recursive descent parsing,
Racket is a general-purpose, multi-paradigm programming language in the Lisp family designed for language creation. It functions as a language workbench, providing a platform for designing and implementing custom programming languages through a flexible system of macros and modules. The system distinguishes itself by offering a comprehensive suite for semantics engineering, allowing for the construction of specialized language subsets and educational layers. It includes tools for custom language design, such as lexer and parser generation, as well as the ability to define module expansion rul
Sweet-core is a JavaScript source-to-source compiler and Lisp-style macro system. It functions as a syntax transformer that extends JavaScript by allowing the definition of custom syntax and operators during the compilation process. The system provides a framework for building domain-specific languages through hygienic, recursive macro expansion and the creation of new language constructs. It distinguishes itself by supporting custom operator definitions with configurable associativity and precedence to control expression evaluation. The compiler includes a specialized module system for mana
Janet is a Lisp-based dynamic programming language featuring a register-based bytecode virtual machine and an embeddable scripting engine. It functions as a fiber-based concurrency runtime and includes a parsing engine based on Parsing Expression Grammars. The project is distinguished by its ability to be integrated into C or C++ applications via a minimal header interface. It utilizes a Lisp-style macro system for compile-time code transformation and employs prototype-based table inheritance for object-oriented behavior. The runtime covers a broad set of capabilities, including asynchronous
Chumsky is a parser combinator library used to build high-performance parsers by composing small parsing functions into complex grammars. It provides multiple parsing engines, including recursive descent and precedence-climbing implementations for resolving the order of operations in mathematical and logical expressions. The library is distinguished by its zero-copy text parsing, which minimizes memory allocations to increase throughput, and its ability to run without a standard library for use in embedded or resource-constrained environments. It also features an error-recovering parser that
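Precedence climbing, mentioned in the Chumsky entry above, can be illustrated with a minimal Python sketch. This is not Chumsky's API or implementation; the operator table `OPS` and the helper names (`tokenize`, `parse_expr`, `parse_atom`, `parse`) are hypothetical and chosen only for this example.

```python
import re

# Hypothetical operator table for this sketch: symbol -> (precedence, right_associative).
OPS = {"+": (1, False), "-": (1, False), "*": (2, False), "/": (2, False), "^": (3, True)}


def tokenize(text):
    # Extract numbers, operators, and parentheses. Anything else
    # (including whitespace) is silently skipped in this sketch.
    return re.findall(r"\d+\.?\d*|[-+*/^()]", text)


def parse_atom(tokens, pos):
    # An atom is a number or a parenthesized sub-expression.
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok == "(":
        node, pos = parse_expr(tokens, pos + 1, 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return node, pos + 1
    if tok in OPS or tok == ")":
        raise SyntaxError(f"unexpected token {tok!r}")
    return float(tok), pos + 1


def parse_expr(tokens, pos=0, min_prec=1):
    # Parse an atom, then keep absorbing operators whose precedence
    # is at least min_prec.
    lhs, pos = parse_atom(tokens, pos)
    while pos < len(tokens) and tokens[pos] in OPS and OPS[tokens[pos]][0] >= min_prec:
        op = tokens[pos]
        prec, right_assoc = OPS[op]
        # Left-associative operators require a strictly higher precedence
        # on the right-hand side; right-associative ones allow the same level.
        next_min = prec if right_assoc else prec + 1
        rhs, pos = parse_expr(tokens, pos + 1, next_min)
        lhs = (op, lhs, rhs)
    return lhs, pos


def parse(text):
    tokens = tokenize(text)
    tree, pos = parse_expr(tokens)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree
```

With this table, `parse("1 + 2 * 3")` nests the multiplication under the addition, and `parse("2 ^ 3 ^ 2")` groups to the right because `^` is marked right-associative.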
nom is a parser combinator framework for Rust used to build complex parsers by combining small, reusable parsing functions. It functions as a zero-copy parsing tool that minimizes memory overhead by returning slices of the original input instead of allocating new memory. The framework is designed for diverse data formats, serving as a binary data parser with configurable endianness and a bitstream processing library capable of extracting values of arbitrary bit length. It also functions as a streaming data parser that can process data arriving in chunks and signal when additional input is req
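The zero-copy idea in the nom entry above (returning slices of the original input rather than new allocations) can be sketched in Python with `memoryview`. This is only an illustration of the concept, not nom's API; `split_header` and `read_u16` are hypothetical helpers for this example.

```python
def split_header(buf: memoryview, header_len: int):
    """Return (header, rest) as views into buf; no bytes are copied."""
    if header_len > len(buf):
        # A streaming parser would signal "need more input" here.
        raise ValueError(f"need {header_len - len(buf)} more byte(s)")
    return buf[:header_len], buf[header_len:]


def read_u16(view: memoryview, byteorder: str = "big") -> int:
    """Decode a 2-byte unsigned integer with a configurable byte order."""
    if len(view) != 2:
        raise ValueError("expected exactly 2 bytes")
    return int.from_bytes(view, byteorder)


data = bytearray(b"\x00\x2aPAYLOAD")
header, rest = split_header(memoryview(data), 2)

big = read_u16(header, "big")        # bytes 00 2a read as big-endian
little = read_u16(header, "little")  # same bytes read as little-endian

# Because `rest` is a view rather than a copy, a change to the
# underlying buffer is visible through it.
data[2] = ord("p")
```

Slicing a `memoryview` shares the underlying buffer, which is why the write to `data` shows up in `rest`; slicing a `bytes` object instead would have produced an independent copy.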
js-yaml is a JavaScript library providing a programmatic interface for parsing and dumping YAML data. It functions as a parser and serializer that converts YAML strings into JavaScript objects and transforms JavaScript objects back into YAML format. The library includes a command-line interface for parsing or dumping YAML data via direct input or data pipes. It also features an abstract syntax tree transformer to modify the structure of data during serialization. The project provides capabilities for multi-document processing and the definition of custom schemas and tags to handle specialize
TrumpScript is a Python-based domain-specific language and compiler extension that wraps the Python runtime to enforce custom grammar and vocabulary rules. It transforms a specialized, case-insensitive vocabulary and natural speech patterns into executable Python instructions. The implementation distinguishes itself through strict constraints on source code, including a variable name system that restricts identifiers to a predefined whitelist and a numeric parser that accepts only integers greater than one million. It further utilizes a token-filtering preprocessor to remove filler words and n
Railroad-diagrams is a utility for generating visual representations of formal grammars and language structures. It functions as a library that transforms dense notation systems, such as Backus-Naur Form or regular expressions, into readable flowcharts. The tool utilizes a coordinate-based layout engine and recursive component composition to construct diagrams as hierarchical trees. By separating geometric calculation logic from the output layer, it supports rendering through Scalable Vector Graphics or Unicode text, ensuring diagrams remain clear and scalable across different environments.
DBML is a domain-specific language and schema definition language used for documenting database architecture and design. It provides a human-readable text format for defining database tables, columns, and relationships in a standardized way. The project functions as a relational schema parser and SQL schema generator. It transforms declarative design specifications into an abstract syntax tree for programmatic manipulation and converts these definitions into executable SQL statements across various database dialects. The system covers relational data modeling, database schema design, and arc
InfluxDB is a specialized time series database platform engineered for the high-speed ingestion, compression, and retrieval of timestamped data at scale. It functions as a distributed metrics platform, providing the infrastructure necessary to organize and analyze massive volumes of time-stamped information to identify trends, patterns, and anomalies within complex data streams. The platform distinguishes itself through a functional dataflow engine that utilizes a specialized programming language for complex analytical transformations and automated tasks. This architecture is supported by a p
Taichi is a domain-specific programming language embedded in Python designed for high-performance numerical computing and computer graphics. It functions as a parallel compiler that translates high-level mathematical expressions into optimized machine instructions, enabling developers to write compute-intensive algorithms that execute across diverse hardware architectures, including CPUs, GPUs, and specialized accelerators. The project distinguishes itself through a hardware-agnostic execution layer that maps parallel operations to multiple backends such as CUDA, Metal, and Vulkan. By utilizi
PHP-Parser is a tool that converts PHP source code into an abstract syntax tree for static analysis and programmatic manipulation. It functions as a parser, a code generator, and a static analysis framework. The project enables the programmatic construction of abstract syntax tree nodes through a fluent interface and provides the ability to transform these trees back into formatted source code. It includes a serializer that exports abstract syntax trees to JSON format and reconstructs them from strings. The toolset covers several capability areas, including namespace resolution, constant exp
Alda is a text-based music programming language and command-line tool for composing, playing, and live-coding musical scores. It functions as a MIDI composition engine that renders plain-text scores into audio output, and as a live coding environment where code entered in a read-eval-print loop produces real-time playback without restarting the interpreter. The system distinguishes itself through an event-driven playback engine that schedules timed note events, an instrument-attribute inheritance model that cascades properties like volume and tempo from global defaults to individual parts, an
UglifyJS2 is a suite of tools designed for parsing, beautifying, mangling, and minifying JavaScript code. It functions by converting source code into an abstract syntax tree to enable programmatic analysis and transformation, and it includes a dedicated generator for creating associated source maps. The project optimizes web production builds by compressing script logic and removing unreachable code. It utilizes name mangling to shorten variable and property names and implements a beautifier to reconstruct compressed scripts into a human-readable layout. The toolset covers broad capability a
This is an educational tutorial that walks through implementing a complete JSON library from scratch in C. The project covers the full data lifecycle of JSON, including parsing text into structured in-memory representations, validating input against the specification, serializing data back into standard JSON output, and providing structured access to elements within parsed arrays and objects. The implementation is built around a hand-written recursive descent parser that processes JSON text by matching grammar rules to build a structured data tree. Parsed values are stored in a tagged union r
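The recursive descent approach described in the tutorial entry above can be sketched in a few dozen lines. The tutorial itself is written in C and stores values in a tagged union; the Python sketch below is not taken from it, uses native Python types in place of the tagged union, and handles only a subset of JSON (no string escapes, loose number validation). The class name `JsonSubsetParser` is hypothetical.

```python
class JsonSubsetParser:
    """Recursive descent parser for a JSON subset: one method per grammar rule."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def parse(self):
        value = self.value()
        self.skip_ws()
        if self.pos != len(self.text):
            raise ValueError(f"trailing data at offset {self.pos}")
        return value

    def skip_ws(self):
        while self.pos < len(self.text) and self.text[self.pos] in " \t\r\n":
            self.pos += 1

    def peek(self):
        # Skip whitespace and return the next character ("" at end of input).
        self.skip_ws()
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def expect(self, literal):
        self.skip_ws()
        if not self.text.startswith(literal, self.pos):
            raise ValueError(f"expected {literal!r} at offset {self.pos}")
        self.pos += len(literal)

    def value(self):
        # Dispatch on the first character, mirroring the grammar's alternatives.
        ch = self.peek()
        if ch == "{":
            return self.object()
        if ch == "[":
            return self.array()
        if ch == '"':
            return self.string()
        for literal, result in (("null", None), ("true", True), ("false", False)):
            if self.text.startswith(literal, self.pos):
                self.pos += len(literal)
                return result
        return self.number()

    def number(self):
        start = self.pos
        while self.pos < len(self.text) and self.text[self.pos] in "+-0123456789.eE":
            self.pos += 1
        if start == self.pos:
            raise ValueError(f"unexpected character at offset {self.pos}")
        return float(self.text[start:self.pos])  # float() rejects malformed runs

    def string(self):
        # Simplification: escape sequences are not handled in this sketch.
        self.expect('"')
        end = self.text.find('"', self.pos)
        if end == -1:
            raise ValueError("unterminated string")
        result = self.text[self.pos:end]
        self.pos = end + 1
        return result

    def array(self):
        self.expect("[")
        items = []
        if self.peek() == "]":
            self.pos += 1
            return items
        while True:
            items.append(self.value())
            if self.peek() == ",":
                self.pos += 1
                continue
            self.expect("]")
            return items

    def object(self):
        self.expect("{")
        members = {}
        if self.peek() == "}":
            self.pos += 1
            return members
        while True:
            key = self.string()
            self.expect(":")
            members[key] = self.value()
            if self.peek() == ",":
                self.pos += 1
                continue
            self.expect("}")
            return members
```

Each method consumes exactly the text for its rule and leaves `self.pos` at the next unread character, which is what lets `array` and `object` call `value` recursively for nested structures.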
Groovy is a JVM programming language and metaprogramming framework that provides a Java-compatible environment for building applications. It functions as a dynamic scripting language and a tool for authoring domain-specific languages, allowing for the execution of custom scripts and the creation of specialized mini-languages with concise syntax. The project is distinguished by its ability to modify program behavior and class definitions through both compile-time and runtime metaprogramming. It utilizes a hybrid typing model that combines dynamic method resolution with optional static type che
Mint is a front-end programming language and compiled web framework designed for building interactive user interfaces. It functions as a transpiler that converts a specialized domain-specific language into standard JavaScript and CSS for execution in a web browser. The toolchain enables type-safe UI development by utilizing static type analysis to validate data structures during the build phase. It organizes web interface logic into a component-based architecture, where encapsulated units synchronize internal state with the rendered view. The system covers a full compiled web toolchain, incl
syn is a Rust syntax tree parser and token stream converter. It serves as a toolkit for procedural macro development, providing a framework to parse Rust source code into structured syntax trees for analysis and transformation. The project enables the manipulation of Rust abstract syntax trees through specialized visitor and folder patterns for traversing and mutating nodes. It provides a bidirectional mapping that allows developers to convert token streams into structured trees and print those trees back into tokens for code generation. The library covers a broad range of syntax analysis ca
This project is a C language interpreter and a practical implementation of a programming language. It parses and executes C source code directly, removing the requirement for a separate compilation step. The interpreter is designed for self-hosting, meaning it is capable of interpreting its own source code to demonstrate recursive language processing and execution. The system covers the primary stages of language processing, including lexical analysis, recursive descent parsing, and tree-walk interpretation using an abstract syntax tree. It manages memory and scope through a dynamic symbol t
Emmet is a markup code generator and web development productivity toolkit. It serves as an expansion engine that converts shorthand abbreviations and CSS-like selectors into full HTML, XML, and other markup structures. The project features a dedicated CSS shorthand expansion engine that transforms concise property codes into full style declarations, including the automatic generation of vendor prefixes and gradients. It distinguishes itself by offering a programmable expansion process through custom snippet definitions, abbreviation alias mapping, and script-based extensibility. The toolkit
yaml-cpp is a C++ library for parsing and emitting YAML 1.2 documents. It provides a complete YAML processing pipeline, from reading YAML content into a traversable node tree to writing in-memory data structures back as YAML text. The library represents parsed YAML as a mutable tree of typed nodes, supporting scalars, sequences, maps, and aliases. It uses a recursive-descent parser to build this node tree, and a stream-based emitter to generate YAML output incrementally. Template-based type conversion enables compile-time serialization between YAML nodes and C++ types, including support for c
Wuffs is a toolset for generating memory-safe, sandboxed parsers and decoders from domain-specific language specifications. It functions as a compiler that transforms these specifications into executable code for C, Go, or Rust, specifically designed to decode untrusted file formats while preventing buffer and integer overflows. The project employs a sandboxed execution model that prohibits system calls and manual memory management to ensure computations are side-effect free. It utilizes a refined type system and compile-time constraint verification to enforce memory safety, alongside saturat
This project is an educational compiler implementation and architecture demo. It serves as a small-scale C-style language compiler designed to demonstrate the fundamental stages of transforming source code into executable machine instructions. The codebase functions as a tool for compiler architecture education and design prototyping. It illustrates the process of building an educational language implementation to help users understand the mechanics of parsing and code generation. The implementation covers the primary stages of a compiler pipeline, including regular expression tokenization,
KaTeX is a typesetting library and web math renderer that transforms TeX and LaTeX mathematical notation into high-quality HTML and CSS for web browsers. It functions as a math notation parser and LaTeX-to-HTML converter, capable of operating as both a client-side library and a server-side math renderer to generate static HTML expressions. The project supports a wide range of specialized mathematical rendering, including chemical equation rendering, bra-ket notation for quantum mechanics, and mathematical logic typesetting. It provides comprehensive controls for structural layouts such as mat
Streem is a stream-based programming language and data pipeline orchestrator. It provides a domain-specific language for defining concurrent data flows, allowing users to link data sources to destinations through a sequence of operations that transform and filter individual stream elements. The system uses a custom script syntax to define data-flow connections and pipeline definitions. This allows for the orchestration of concurrent data processing where multiple pipeline stages execute simultaneously to move data elements through the system. The platform covers functional data transformatio