
Arena-based parsers
willvarfar 12 days ago
I enjoy messing around with parsing things etc. Although I started with handmade, unschooled attempts many decades ago, I later went through the classic yacc/bison phase before firmly getting back to the hand-rolling custom side of things, where I'm much happier.

My main motivation is speed, e.g. I have enjoyed handcrafting wickedly fast custom JSON and SQL parsers and bits of code that sit on top of them.

My general approach now is to use a tokenizer that generates an int that represents each token, where the bits in the int tell me the location of the token in the source buffer and its type etc.

In languages with a lot of per-object overhead like Java this is literally a long; but in the C/C++/Rust camp it can look and feel like an object or struct or whatever because, underneath, it still ends up being an int that gets passed in registers and on the stack etc.
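Roughly, a packed token might look like this in C (a minimal sketch; the field widths and names are made up for illustration, not any particular parser's real layout):

    #include <stdint.h>

    /* Hypothetical layout: 8-bit kind, 24-bit length, 32-bit offset into
       the source buffer. Real field widths depend on the input sizes you
       expect to handle. */
    typedef uint64_t token;

    static inline token tok_make(uint8_t kind, uint32_t len, uint32_t off) {
        return ((uint64_t)kind << 56) | ((uint64_t)(len & 0xFFFFFF) << 32) | off;
    }

    static inline uint8_t  tok_kind(token t)   { return (uint8_t)(t >> 56); }
    static inline uint32_t tok_len(token t)    { return (uint32_t)(t >> 32) & 0xFFFFFF; }
    static inline uint32_t tok_offset(token t) { return (uint32_t)t; }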

Sometimes the parsing is one-pass and the tokens don't need to be stored at all; it's usually the memory allocations that kill parsing performance, and a once-through JSON decoder can completely eliminate bottlenecks on hot paths in data processing etc.

Other times I run through once and store these tokens in an array, particularly if I'm going to be going back over them. It's actually easy to make a function that, given an 'int' token, finds the next one, so if you are going through the data several times you don't need any allocations. But other times you want to go backwards, or you are really going to be going through the data a lot, so it makes sense to store the offsets of everything.
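The "find the next token" function can just re-lex from the end of the current one, so iterating never allocates. A toy sketch building on the packing above (it only recognizes whitespace-separated words, which is obviously not a real grammar):

    #include <ctype.h>
    #include <stddef.h>

    /* Lex the token after t straight out of the source buffer.
       Toy rules: skip whitespace, then take a run of non-whitespace.
       Returns 0 at end of input. */
    static inline token tok_next(const char *src, size_t src_len, token t) {
        size_t i = tok_offset(t) + tok_len(t);
        while (i < src_len && isspace((unsigned char)src[i])) i++;
        if (i >= src_len) return 0;
        size_t start = i;
        while (i < src_len && !isspace((unsigned char)src[i])) i++;
        return tok_make(/* kind: "word" */ 1, (uint32_t)(i - start), (uint32_t)start);
    }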

Sometimes future steps will be reordering things and transforming the AST; in those cases I generally have a writeable arena where I append the text of new tokens, and a bit in the token ints discriminates between the read-only source buffer and this transformed buffer. This is all particularly cool when it comes to generating sensible error messages with context, which is a finesse most handmade-parser authors later rue having overlooked :)
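One way such a discriminator bit might work when resolving a token back to its text (the bit position and names are invented for the sketch):

    /* Hypothetical: the top bit of the offset says which buffer the text
       lives in -- the read-only source, or the append-only arena holding
       text synthesized during AST transforms. */
    #define TOK_IN_ARENA (1u << 31)

    static inline const char *tok_text(token t, const char *src, const char *arena) {
        uint32_t off = tok_offset(t);
        const char *base = (off & TOK_IN_ARENA) ? arena : src;
        return base + (off & ~TOK_IN_ARENA);
    }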

I would be interested to know just how unmainstream this kind of approach is. Please weigh in; I'd love to learn new tricks :)


kragen 12 days ago
probably worth noting that the default peg backend for the c combinator parser 'hammer' uses an arena allocator for speed and because it's a good match for packrat memoization. there are hammer bindings for several languages including python and java, not sure if there's a rust one yet. hammer also has some other backends including lalr, and in some cases you can write a grammar such that you can parse it with either packrat or lalr

hammer's arena allocator is not very fast, though. it's faster than any malloc i've seen, but a good arena allocator is faster than calling an empty function, so you have to inline the fast path. the open-source prototype arena allocator i wrote in http://canonical.org/~kragen/sw/dev3/kmregion.h (and .c) doesn't quite reach that level, but it takes about 1.9 nanoseconds per allocation (520 million allocations per second on one core), while glibc's malloc takes about 19 nanoseconds per allocation. more recent versions of glibc (on a faster cpu) have reduced that to 14 nanoseconds in that particular microbenchmark. details are at the link. your mileage may vary
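for a sense of why inlining matters: the fast path of a bump allocator is just a pointer increment and a bounds check, something like this minimal sketch (not kmregion's actual code; the slow path here just mallocs a fresh block and doesn't keep a block list, which a real arena would so it can free everything at once):

    #include <stddef.h>
    #include <stdlib.h>

    typedef struct { char *p, *end; } arena;

    /* slow path: a real arena links the new block into a list so the whole
       arena can be freed in one go; this sketch just leaks old blocks */
    static void *arena_alloc_slow(arena *a, size_t n) {
        size_t block = n > (1 << 16) ? n : (1 << 16);
        char *mem = malloc(block);
        if (!mem) return NULL;
        a->p = mem + n;
        a->end = mem + block;
        return mem;
    }

    /* fast path: this is the part you want inlined at every call site */
    static inline void *arena_alloc(arena *a, size_t n) {
        n = (n + 15) & ~(size_t)15;               /* keep 16-byte alignment */
        if ((size_t)(a->end - a->p) < n) return arena_alloc_slow(a, n);
        void *r = a->p;
        a->p += n;
        return r;
    }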

chris wellons wrote an excellent article last year about his approach to arena allocation in c in https://nullprogram.com/blog/2023/09/27/

for those who aren't aware, garbage collectors for languages like js, c#, ocaml, or java are usually generational copying collectors for efficiency, which means they use what is effectively an arena allocator for most new allocations. often they reserve a cpu register or two for this, which my c implementation above can't

for c, the apache portable runtime 'apr' has a generic arena allocator in it called pools, which also supports destructors (the drop trait, in rust) and nested pools
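usage is roughly like this, from memory of the apr pools api (double-check apr_pools.h for the exact signatures before trusting it):

    #include <apr_general.h>
    #include <apr_pools.h>
    #include <stdio.h>

    /* a cleanup ("destructor") that apr runs when the pool is destroyed */
    static apr_status_t say_goodbye(void *data) {
        printf("cleaning up %s\n", (const char *)data);
        return APR_SUCCESS;
    }

    int main(void) {
        apr_initialize();

        apr_pool_t *parent, *child;
        apr_pool_create(&parent, NULL);
        apr_pool_create(&child, parent);            /* nested pool */

        char *buf = apr_palloc(child, 64);          /* arena-style allocation */
        apr_pool_cleanup_register(child, "child buffer", say_goodbye,
                                  apr_pool_cleanup_null);
        (void)buf;

        apr_pool_destroy(parent);  /* tears down child too and runs cleanups */
        apr_terminate();
        return 0;
    }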

the arena allocator in gcc is called 'obstacks' and got moved into libiberty so you can use it in other c programs. or hypothetically in rust


asplake 12 days ago
> To release memory we simply release the blob itself, so it's like a batched deallocation, sort of. The consequence here is that all of our types must be:

> 1. trivially copyable (: Copy if we speak in Rust)

> 2. trivially destructible (i.e. they can't have impl Drop)

2 (Drop) seems obvious enough, but why 1 (Copy)? There’s a bit of Rust knowledge implicit there that I’m unaware of, and I might not be alone in that. Could someone explain?


Cloudef 12 days ago
Arena allocation is very trivial in zig

olliej 12 days ago
Parsing is fun, but assuming a good baseline allocator, arena allocation doesn’t get you a whole lot these days (maybe it’s better under a GC environment, where teardown doesn’t require a tree traversal?), especially if you’re able to lazily build the AST (this is what I did in JSC years ago).

I still feel that there’s a lot of practical parsing stuff that isn’t considered in academia, which leads to a near-constant preference for LALR and LR parser generators despite them being honestly worse than recursive descent in almost every important metric if you want to do anything at all other than producing a completely generic AST for known-good data.