Merged
100 commits
429672e
Move buffer out of utils
kddnewton Mar 17, 2026
bda0f88
Split up buffer headers between internal and external
kddnewton Mar 17, 2026
1039dff
Only include what is necessary for buffer
kddnewton Mar 17, 2026
e4df041
Move strncasecmp out of utils and make internal only
kddnewton Mar 17, 2026
cdf8bed
Make memchr implementation internal
kddnewton Mar 17, 2026
0862b48
Remove unnecessary PRISM_EXPORTED_FUNCTION from source files
kddnewton Mar 17, 2026
cdc9b97
Move integer out of utils
kddnewton Mar 17, 2026
16fd2bf
Split up integer between public and private headers
kddnewton Mar 17, 2026
5966ab6
Move line offset list out of util
kddnewton Mar 17, 2026
5858690
Split up line offset list into public and internal
kddnewton Mar 17, 2026
895b395
Move char to internal headers
kddnewton Mar 17, 2026
c74b7c3
Move arena out of utils
kddnewton Mar 17, 2026
c03b69b
Split up arena headers into public and internal
kddnewton Mar 17, 2026
a18e3ca
Move compiler macro definitions into include/prism/attribute
kddnewton Mar 17, 2026
104b6ab
Move strpbrk into internal
kddnewton Mar 17, 2026
af68332
Split up diagnostic headers into public and internal
kddnewton Mar 17, 2026
cc97110
Move constant pool out of utils
kddnewton Mar 17, 2026
281244f
Split up constant pool headers between public and internal
kddnewton Mar 17, 2026
b27fd82
Move strings out of util
kddnewton Mar 17, 2026
60e105f
Split up public and internal strings headers
kddnewton Mar 17, 2026
c661045
Move list out of utils
kddnewton Mar 17, 2026
4149565
Fully remove util dir
kddnewton Mar 17, 2026
2cd264f
Split up list public and internal headers
kddnewton Mar 17, 2026
f9f9cd5
Split up encoding public and internal headers
kddnewton Mar 17, 2026
d2ec362
Split up static literals public and internal
kddnewton Mar 17, 2026
abd0a83
Split up public/internal options
kddnewton Mar 17, 2026
cc93903
Put inline in its own header
kddnewton Mar 17, 2026
8d1df78
Move regexp to internal
kddnewton Mar 17, 2026
2496b01
Split out excludes into its own header
kddnewton Mar 17, 2026
4090539
Trim down parser.h
kddnewton Mar 17, 2026
7690458
Trim down node.h
kddnewton Mar 17, 2026
8ad8802
More splitting of headers
kddnewton Mar 17, 2026
400b217
Remove defines
kddnewton Mar 17, 2026
01d575a
Move compiler detection stuff into include/prism/compiler
kddnewton Mar 17, 2026
1e3ec12
Move allocator to internal headers
kddnewton Mar 17, 2026
323f7f1
Move file system into compiler headers
kddnewton Mar 17, 2026
a5dfba8
Trim down even more of internal header includes
kddnewton Mar 17, 2026
a4a44cb
Split node.h headers
kddnewton Mar 17, 2026
2b52981
pm_buffer_free -> pm_buffer_cleanup
kddnewton Mar 17, 2026
1b594e1
Make buffer an opaque pointer
kddnewton Mar 17, 2026
b1be4b4
pm_parser_free -> pm_parser_cleanup
kddnewton Mar 17, 2026
88a247a
Move string query into its own file
kddnewton Mar 17, 2026
6901e85
Trim down prism.h
kddnewton Mar 17, 2026
92b48ce
pm_string_free -> pm_string_cleanup
kddnewton Mar 17, 2026
0edaefb
pm_options_free -> pm_options_cleanup
kddnewton Mar 17, 2026
20de0e4
Do not return bool from pm_options_scope_init
kddnewton Mar 17, 2026
0167645
Make options fully opaque
kddnewton Mar 17, 2026
d315325
Give full lifetime functions to parser
kddnewton Mar 18, 2026
035061c
Update pm_parse_stream API to make parser opaque
kddnewton Mar 18, 2026
3da08fd
Make parser an opaque pointer
kddnewton Mar 18, 2026
7cb8b59
Move static literals entirely internal
kddnewton Mar 18, 2026
2b84d22
Move some serialize functions internal
kddnewton Mar 18, 2026
10ebcaf
Move encoding entirely internal
kddnewton Mar 18, 2026
da00377
Move some options internal metadata internal
kddnewton Mar 18, 2026
a29aab4
Consistency in naming
kddnewton Mar 18, 2026
0b17e49
Make pm_comment_t opaque
kddnewton Mar 18, 2026
1ddff36
Move comment into its own section
kddnewton Mar 18, 2026
84b08e2
Move more constants internal
kddnewton Mar 18, 2026
6f40516
Move diagnostics entirely internal
kddnewton Mar 18, 2026
f2e8648
Move magic comments entirely internal
kddnewton Mar 18, 2026
bbc2023
Fix up build
kddnewton Mar 18, 2026
e8606f7
Do not define a shim if the define is set
kddnewton Mar 18, 2026
e0d17eb
Remove iterators, just use callbacks
kddnewton Mar 18, 2026
1f4bf83
Code review
kddnewton Mar 18, 2026
2867c52
Move node list append internal
kddnewton Mar 18, 2026
9498322
Fix up diagnostic templates
kddnewton Mar 18, 2026
2fcab5e
Do not inline node_new functions
kddnewton Mar 18, 2026
ee6f320
Fold node_new into node/ast
kddnewton Mar 18, 2026
639da14
Move some of arena internal
kddnewton Mar 18, 2026
93085a5
Make arena fully opaque
kddnewton Mar 18, 2026
250b2c9
Make the constant pool fully opaque
kddnewton Mar 18, 2026
fb0d136
Inline comments and magic comments, they do not need their own TUs
kddnewton Mar 18, 2026
aa84fd5
Cleanup
kddnewton Mar 18, 2026
879139e
pm_parser_init and pm_parser_cleanup -> internal
kddnewton Mar 18, 2026
456167d
Move even more headers into their own spots
kddnewton Mar 18, 2026
3b09886
Move JSON to its own TU
kddnewton Mar 18, 2026
f02d270
Move parse_success_p into serialization functions
kddnewton Mar 18, 2026
d54885e
Naming conventions
kddnewton Mar 18, 2026
06a944a
Make some token logic internal
kddnewton Mar 18, 2026
7dde210
Documentation on public API functions
kddnewton Mar 18, 2026
94d16c6
Clean up documentation
kddnewton Mar 18, 2026
b66fbf9
Clean up rake build
kddnewton Mar 18, 2026
26731cc
Make sure we have at least one declaration in TUs
kddnewton Mar 18, 2026
cab3fd8
Fix up rust side of the build
kddnewton Mar 18, 2026
852bb04
Ensure wasm build is happy
kddnewton Mar 18, 2026
149cc9d
Final review
kddnewton Mar 18, 2026
665bcf3
Rebase
kddnewton Mar 18, 2026
1c1e948
Fix up bindings
kddnewton Mar 18, 2026
717e4e7
Ensure we free options before raising type errors
kddnewton Mar 19, 2026
6ba2c64
Add necessary functions for CRuby integration
kddnewton Mar 19, 2026
d4a3ef9
pm_parser_constant_find
kddnewton Mar 19, 2026
a52c481
Also expose pm_constant_id_list_init, pm_constant_id_list_append, and…
kddnewton Mar 19, 2026
f50c25b
Introduce pm_source_t
kddnewton Mar 19, 2026
603e482
Use xfree_sized everywhere possible
kddnewton Mar 19, 2026
0c6494a
pm_source_owned_new
kddnewton Mar 19, 2026
eb398af
Revert xfree_sized for integer
kddnewton Mar 19, 2026
75eb63e
Fix up gemspec build
kddnewton Mar 19, 2026
b5683c8
Fix up FFI in Ractors reading internal ivar
kddnewton Mar 19, 2026
eb1d518
Rename strings to stringy because of linux conflicts
kddnewton Mar 19, 2026
ba16ae2
Move PRISM_NODISCARD to the correct position
kddnewton Mar 19, 2026
2 changes: 1 addition & 1 deletion .github/workflows/cpp-bindings.yml
@@ -29,6 +29,6 @@ jobs:
- name: Compile prism
run: bundle exec rake compile
- name: Compile C++
-run: g++ -o ./cpp_test cpp/test.cpp build/static/*.o build/static/util/*.o -Iinclude
+run: g++ -o ./cpp_test cpp/test.cpp build/static/*.o -Iinclude
- name: Run C++
run: ./cpp_test
5 changes: 2 additions & 3 deletions .gitignore
@@ -31,8 +31,7 @@ out.svg
/fuzz/output/
/gemfiles/typecheck/bin/
/include/prism/ast.h
-/include/prism/diagnostic.h
-/include/prism/node_new.h
+/include/prism/internal/diagnostic.h
/javascript/node_modules/
/javascript/package-lock.json
/javascript/src/deserialize.js
@@ -58,7 +57,7 @@ out.svg
/src/node.c
/src/prettyprint.c
/src/serialize.c
-/src/token_type.c
+/src/tokens.c
/src/**/*.o
/rbi/prism/dsl.rbi
/rbi/prism/node.rbi
4 changes: 2 additions & 2 deletions Doxyfile
@@ -23,8 +23,8 @@ PROJECT_NAME = "Prism Ruby parser"
OUTPUT_DIRECTORY = doc
JAVADOC_AUTOBRIEF = YES
OPTIMIZE_OUTPUT_FOR_C = YES
-INPUT = src src/util include include/prism include/prism/util
-EXCLUDE = include/prism/debug_allocator.h
+INPUT = include/prism.h include/prism
+EXCLUDE = include/prism/internal
HTML_OUTPUT = c
SORT_MEMBER_DOCS = NO
GENERATE_LATEX = NO
6 changes: 3 additions & 3 deletions Makefile
@@ -70,12 +70,12 @@ build/fuzz.%: $(SOURCES) fuzz/%.c fuzz/fuzz.c
$(ECHO) "building $* fuzzer"
$(Q) $(MAKEDIRS) $(@D)
$(ECHO) "building main fuzz binary"
-$(Q) afl-clang-lto $(DEBUG_FLAGS) $(CPPFLAGS) $(CFLAGS) $(FUZZ_FLAGS) -O0 -fsanitize-ignorelist=fuzz/asan.ignore -fsanitize=fuzzer,address -ggdb3 -std=c99 -Iinclude -o $@ $^
+$(Q) afl-clang-lto $(DEBUG_FLAGS) $(CPPFLAGS) $(CFLAGS) $(FUZZ_FLAGS) -O0 -fsanitize=fuzzer,address -ggdb3 -std=c99 -Iinclude -o $@ $^
$(ECHO) "building cmplog binary"
-$(Q) AFL_LLVM_CMPLOG=1 afl-clang-lto $(DEBUG_FLAGS) $(CPPFLAGS) $(CFLAGS) $(FUZZ_FLAGS) -O0 -fsanitize-ignorelist=fuzz/asan.ignore -fsanitize=fuzzer,address -ggdb3 -std=c99 -Iinclude -o $@.cmplog $^
+$(Q) AFL_LLVM_CMPLOG=1 afl-clang-lto $(DEBUG_FLAGS) $(CPPFLAGS) $(CFLAGS) $(FUZZ_FLAGS) -O0 -fsanitize=fuzzer,address -ggdb3 -std=c99 -Iinclude -o $@.cmplog $^

build/fuzz.heisenbug.%: $(SOURCES) fuzz/%.c fuzz/heisenbug.c
-$(Q) afl-clang-lto $(DEBUG_FLAGS) $(CPPFLAGS) $(CFLAGS) $(FUZZ_FLAGS) -O0 -fsanitize-ignorelist=fuzz/asan.ignore -fsanitize=fuzzer,address -ggdb3 -std=c99 -Iinclude -o $@ $^
+$(Q) afl-clang-lto $(DEBUG_FLAGS) $(CPPFLAGS) $(CFLAGS) $(FUZZ_FLAGS) -O0 -fsanitize=fuzzer,address -ggdb3 -std=c99 -Iinclude -o $@ $^

fuzz-debug:
$(ECHO) "entering debug shell"
1 change: 0 additions & 1 deletion README.md
@@ -44,7 +44,6 @@ The repository contains the infrastructure for both a shared library (libprism)
│ └── prism Sample code that uses the Ruby API for documentation purposes
├── sig RBS type signatures for the Ruby library
├── src
-│   ├── util various utility files
│   └── prism.c main entrypoint for the shared library
├── templates contains ERB templates generated by templates/template.rb
│   └── template.rb generates code from the nodes and tokens configured by config.yml
21 changes: 10 additions & 11 deletions cpp/test.cpp
@@ -5,21 +5,20 @@ extern "C" {
#include <iostream>

int main() {
-pm_arena_t arena = { 0 };
-pm_parser_t parser;
-pm_parser_init(&arena, &parser, reinterpret_cast<const uint8_t *>("1 + 2"), 5, NULL);
+pm_arena_t *arena = pm_arena_new();
+pm_parser_t *parser = pm_parser_new(arena, reinterpret_cast<const uint8_t *>("1 + 2"), 5, NULL);

-pm_node_t *root = pm_parse(&parser);
-pm_buffer_t buffer = { 0 };
+pm_node_t *root = pm_parse(parser);
+pm_buffer_t *buffer = pm_buffer_new();

-pm_prettyprint(&buffer, &parser, root);
-pm_buffer_append_byte(&buffer, '\0');
+pm_prettyprint(buffer, parser, root);

-std::cout << buffer.value << std::endl;
+std::string_view view(pm_buffer_value(buffer), pm_buffer_length(buffer));
+std::cout << view << std::endl;

-pm_buffer_free(&buffer);
-pm_parser_free(&parser);
-pm_arena_free(&arena);
+pm_buffer_free(buffer);
+pm_parser_free(parser);
+pm_arena_free(arena);

return 0;
}
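
The `cpp/test.cpp` change above swaps stack-allocated structs for heap-allocated handles that are read through accessor functions. As a rough illustration of that general opaque-pointer pattern in C, here is a minimal sketch. Every `demo_buffer_*` name is a hypothetical stand-in invented for this example; none of it is prism's actual implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type: a public header would expose only this forward
 * declaration, so callers cannot depend on the struct layout. */
typedef struct demo_buffer demo_buffer_t;

/* The definition would live in a source file or internal header. */
struct demo_buffer {
    char *value;
    size_t length;
    size_t capacity;
};

/* Allocate a zero-initialized buffer; returns NULL on allocation failure. */
demo_buffer_t *demo_buffer_new(void) {
    return calloc(1, sizeof(demo_buffer_t));
}

/* Append bytes, growing the storage as needed. Returns 1 on success and 0
 * if memory could not be allocated. */
int demo_buffer_append(demo_buffer_t *buffer, const char *data, size_t length) {
    assert(buffer != NULL);
    if (length == 0) return 1;

    if (buffer->length + length > buffer->capacity) {
        size_t capacity = buffer->capacity == 0 ? 16 : buffer->capacity;
        while (capacity < buffer->length + length) capacity *= 2;

        char *value = realloc(buffer->value, capacity);
        if (value == NULL) return 0;

        buffer->value = value;
        buffer->capacity = capacity;
    }

    memcpy(buffer->value + buffer->length, data, length);
    buffer->length += length;
    return 1;
}

/* Accessors replace direct field access such as `buffer.value`. */
const char *demo_buffer_value(const demo_buffer_t *buffer) { return buffer->value; }
size_t demo_buffer_length(const demo_buffer_t *buffer) { return buffer->length; }

/* Release both the contents and the handle itself. */
void demo_buffer_free(demo_buffer_t *buffer) {
    if (buffer == NULL) return;
    free(buffer->value);
    free(buffer);
}
```

With this shape, a caller pairs each `demo_buffer_new()` with one `demo_buffer_free()` and reads the contents as a pointer plus a length, so no trailing NUL byte needs to be appended before printing.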
2 changes: 1 addition & 1 deletion docs/build_system.md
@@ -87,7 +87,7 @@ If you need to use memory allocation functions implemented outside of the standa
* Additionally, include `-I [path/to/custom_allocator]` where your `prism_xallocator.h` is located
* Link the implementation of `prism_xallocator.c` that contains functions declared in `prism_xallocator.h`

-For further clarity, refer to `include/prism/defines.h`.
+For further clarity, refer to `include/prism/internal/allocator.h`.

### Building prism from source as a C library

4 changes: 2 additions & 2 deletions docs/encoding.md
@@ -107,7 +107,7 @@ For each of these encodings, prism provides functions for checking if the subseq

## Getting notified when the encoding changes

-You may want to get notified when the encoding changes based on the result of parsing an encoding comment. We use this internally for our `lex` function in order to provide the correct encodings for the tokens that are returned. For that you can register a callback with `pm_parser_register_encoding_changed_callback`. The callback will be called with a pointer to the parser. The encoding can be accessed through `parser->encoding`.
+You may want to get notified when the encoding changes based on the result of parsing an encoding comment. We use this internally for our `lex` function in order to provide the correct encodings for the tokens that are returned. For that you can register a callback with `pm_parser_encoding_changed_callback_set`. The callback will be called with a pointer to the parser. The encoding can be accessed through `parser->encoding`.

```c
// When the encoding that is being used to parse the source is changed by prism,
@@ -117,5 +117,5 @@ typedef void (*pm_encoding_changed_callback_t)(pm_parser_t *parser);
// Register a callback that will be called whenever prism changes the encoding
// it is using to parse based on the magic comment.
PRISM_EXPORTED_FUNCTION void
-pm_parser_register_encoding_changed_callback(pm_parser_t *parser, pm_encoding_changed_callback_t callback);
+pm_parser_encoding_changed_callback_set(pm_parser_t *parser, pm_encoding_changed_callback_t callback);
```
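
The setter-style registration documented above can be illustrated with a small standalone sketch. It does not link against prism: `demo_parser_t` and every `demo_*` function are hypothetical stand-ins, and the rule that the callback fires only when the encoding name actually changes is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins mirroring the shape of the documented typedef. */
typedef struct demo_parser demo_parser_t;
typedef void (*demo_encoding_changed_callback_t)(demo_parser_t *parser);

struct demo_parser {
    const char *encoding_name;
    demo_encoding_changed_callback_t encoding_changed_callback;
    int notifications;
};

/* Store the callback on the parser; passing NULL clears it. */
void demo_parser_encoding_changed_callback_set(demo_parser_t *parser, demo_encoding_changed_callback_t callback) {
    assert(parser != NULL);
    parser->encoding_changed_callback = callback;
}

/* Simulates the parser reacting to an encoding magic comment. The callback is
 * invoked only if the encoding differs from the current one (an assumption). */
void demo_parser_encoding_change(demo_parser_t *parser, const char *name) {
    if (strcmp(parser->encoding_name, name) == 0) return;

    parser->encoding_name = name;
    if (parser->encoding_changed_callback != NULL) {
        parser->encoding_changed_callback(parser);
    }
}

/* Example callback: count how many times the encoding changed. */
void demo_count_change(demo_parser_t *parser) {
    parser->notifications++;
}
```

A caller would register `demo_count_change` once after creating the parser and then inspect the parser inside the callback, which matches the documented contract that the callback receives a pointer to the parser.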
15 changes: 4 additions & 11 deletions docs/fuzzing.md
@@ -5,34 +5,29 @@ We use fuzzing to test the various entrypoints to the library. The fuzzer we use
```
fuzz
├── corpus
-│   ├── parse fuzzing corpus for parsing (a symlink to our fixtures)
-│   └── regexp fuzzing corpus for regexp
+│   └── parse fuzzing corpus for parsing (a symlink to our fixtures)
├── dict a AFL++ dictionary containing various tokens
├── docker
│   └── Dockerfile for building a container with the fuzzer toolchain
├── fuzz.c generic entrypoint for fuzzing
├── heisenbug.c entrypoint for reproducing a crash or hang
├── parse.c fuzz handler for parsing
├── parse.sh script to run parsing fuzzer
-├── regexp.c fuzz handler for regular expression parsing
-├── regexp.sh script to run regexp fuzzer
└── tools
   ├── backtrace.sh generates backtrace files for a crash directory
   └── minimize.sh generates minimized crash or hang files
```

## Usage

-There are currently three fuzzing targets
+There is currently one fuzz target:

- `pm_serialize_parse` (parse)
-- `pm_regexp_parse` (regexp)

-Respectively, fuzzing can be performed with
+Fuzzing can be performed with

```
make fuzz-run-parse
-make fuzz-run-regexp
```

To end a fuzzing job, interrupt with CTRL+C. To enter a container with the fuzzing toolchain and debug utilities, run
@@ -43,8 +38,6 @@ make fuzz-debug

# Out-of-bounds reads

-Currently, encoding functionality implementing the `pm_encoding_t` interface can read outside of inputs. For the time being, ASAN instrumentation is disabled for functions from src/enc. See `fuzz/asan.ignore`.
-
To disable ASAN read instrumentation globally, use the `FUZZ_FLAGS` environment variable e.g.

```
@@ -55,7 +48,7 @@ Note, that this may make reproducing bugs difficult as they may depend on memory

```
make fuzz-debug # enter the docker container with build tools
-make build/fuzz.heisenbug.parse # or .regexp
+make build/fuzz.heisenbug.parse
./build/fuzz.heisenbug.parse path-to-problem-input
```

10 changes: 5 additions & 5 deletions docs/serialization.md
@@ -159,8 +159,8 @@ typedef struct {
size_t capacity;
} pm_buffer_t;

-// Free the memory associated with the buffer.
-void pm_buffer_free(pm_buffer_t *);
+// Free the memory held by the buffer.
+void pm_buffer_cleanup(pm_buffer_t *);

// Parse and serialize the AST represented by the given source to the given
// buffer.
@@ -172,12 +172,12 @@ Typically you would use a stack-allocated `pm_buffer_t` and call `pm_serialize_p
```c
void
serialize(const uint8_t *source, size_t length) {
-pm_buffer_t buffer = { 0 };
-pm_serialize_parse(&buffer, source, length, NULL);
+pm_buffer_t *buffer = pm_buffer_new();
+pm_serialize_parse(buffer, source, length, NULL);

// Do something with the serialized string.

-pm_buffer_free(&buffer);
+pm_buffer_free(buffer);
}
```

5 changes: 1 addition & 4 deletions ext/prism/extconf.rb
@@ -118,10 +118,7 @@ def add_libprism_source(path)
src_list path
end

-$srcs = src_list("$(srcdir)") +
-add_libprism_source("$(srcdir)/../../src") +
-add_libprism_source("$(srcdir)/../../src/util")
-
+$srcs = src_list("$(srcdir)") + add_libprism_source("$(srcdir)/../../src")
$headers += Dir["#{$srcdir}/../../include/**/*.h"]

# Finally, we'll create the `Makefile` that is going to be used to configure and