BFScript: Bridging the Gap Between Readability and Brainfuck
The Idea: What It Is and Why I Built It
Ever since I first encountered esoteric programming languages, Brainfuck stood out. Its extreme minimalism was fascinating, but also intimidating. It’s Turing complete, meaning theoretically you can compute anything with it, but practically, writing or reading anything beyond simple examples is incredibly difficult. The idea of simplifying this process got stuck in my head.
This led me to create BFScript: a compiler that takes code written in a simpler, C-like syntax and translates it into functional Brainfuck code.
My initial attempt was a different project, the Brainfuck Transpiler. However, I soon realized that approach had fundamental limitations and wasn’t truly Turing complete. It couldn’t handle the complexity I envisioned. So, I decided to start over with a more robust compiler approach, which became BFScript.
Primarily, this is a passion project exploring compiler design in a severely constrained environment—how do you build a usable language when your target has no stack, no registers, and only 8 instructions? It’s for me, for the fun of tackling a weird challenge, and maybe for anyone else intrigued by the intersection of conventional programming and esoteric languages.
What is Brainfuck, Anyway?
Before diving into BFScript, it helps to understand the target language. Brainfuck uses only eight simple commands to manipulate a tape of memory cells:
Command |
Description |
> |
Increment the data pointer. |
< |
Decrement the data pointer. |
+ |
Increment the byte at the pointer. |
- |
Decrement the byte at the pointer. |
. |
Output the byte at the pointer. |
, |
Input a byte to the pointer. |
[ |
Jump forward if byte is zero. |
] |
Jump backward if byte is non-zero. |
A simple “Hello World!” in Brainfuck looks like this:
++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.
As you can see, readability isn’t its strong suit. BFScript aims to fix that!
The Journey: From Concept to Reality
After hitting the limits with the simple transpiler, I knew I needed a more structured approach for BFScript. I decided to build a proper compiler using Python.
The key technology choices were:
- Python: I chose Python because I’m comfortable with it, and it has excellent string manipulation capabilities and libraries, which are crucial for code generation. Its readability also helps manage the compiler’s complexity.
- Lark (Parsing Library): Instead of writing a parser from scratch, I used Lark. It allows defining the grammar of the BFScript language in a clean way and automatically generates a parser that turns BFScript code into a structured tree (Abstract Syntax Tree - AST). This saved a massive amount of effort and let me focus on the harder part: translation.
The compilation process generally involves:
- Parsing: Lark reads the BFScript code (
.bfs
file) and validates its syntax, creating an AST.
- Code Generation: My Python code walks through this AST. For each node (like a variable declaration,
while
loop, output
call), it generates the corresponding sequence of Brainfuck commands. This involves figuring out how to manage Brainfuck’s memory tape to represent variables and control flow.
The BFScript language itself evolved to include features essential for non-trivial programs:
- Variables (
size_t name = value;
)
- Arithmetic (
+
, -
)
- Loops (
while (condition) { ... }
)
- Basic I/O (
output('A');
, output(variable);
)
Here’s an example of BFScript code that prints a pyramid, showcasing its readability compared to raw Brainfuck:
// --- Pyramid Printer ---
// Prints a pyramid of '*' characters using nested loops.
size_t height = 7; // Declare and initialize a variable
size_t current_row = 1;
size_t chars_for_this_row = 1;
// Loop for each row
while (current_row <= height) {
// --- Print leading spaces ---
size_t spaces_needed = height - current_row;
size_t spaces_printed = 0;
while (spaces_printed < spaces_needed) {
output(' '); // Output a character literal
spaces_printed = spaces_printed + 1;
}
// --- Print the characters ('*') ---
size_t chars_printed = 0;
while (chars_printed < chars_for_this_row) {
output('*');
chars_printed = chars_printed + 1;
}
// --- Print a newline character ---
output('\n');
// --- Prepare for the next row ---
current_row = current_row + 1;
// Add 2 characters for the next row (1 -> 3 -> 5 -> ...)
chars_for_this_row = chars_for_this_row + 2;
}
This is much easier to understand and maintain!
How It Actually Works: The Technical Bits
Okay, so the interesting part isn’t just that it compiles to Brainfuck—it’s how. I had to solve some genuinely tricky problems to make this work, and honestly, I’m pretty proud of the solutions I came up with.
The Code Generation Pattern: Functions All The Way Down
Here’s the thing I figured out early on: I couldn’t just immediately spit out Brainfuck strings as I walked the AST. Why? Because when you’re parsing something like a + b
, you don’t yet know where the result needs to go. Maybe it’s going into a variable. Maybe it’s being output directly. Maybe it’s part of a bigger expression like (a + b) * c
.
So instead, I built a system where every expression returns a Python function—a code_func
—that generates the actual Brainfuck code when you call it:
def code_func(result_cell):
c = []
# Generate BF instructions that put the result in result_cell
return c
This result_cell
parameter is the key. It lets me decide later where the result should end up. When I finally know the context (like “store this in variable X”), I call the function and pass in the destination cell.
For something like a + b * c
, these functions nest naturally:
- Parse
b * c
→ get back a code_func_mult
- Parse
a + ...
→ create a new code_func_add
that calls code_func_mult
internally
- When I finally execute this chain, it recursively generates the right BF code in the right order
I later found out this is similar to something called “tagless final” from programming language research, which was kind of validating—I’d reinvented a real technique without knowing it existed.
Memory Management: Keeping Track of the Tape
Brainfuck’s tape is your only storage. No stack, no heap, just cells stretching out in both directions. I needed a way to manage this sanely.
Variables get permanent locations—each size_t
gets assigned a specific cell number and stays there. But temporary calculations (like the intermediate result of b * c
in a + b * c
) need temporary cells that get cleaned up after use.
I built a MemoryManager
class with a pool of temp cells (0–19 by default):
class MemoryManager:
def get_temp_cell(self):
if not self.temp_cell_pool:
raise MemoryError("Out of temporary cells!")
return self.temp_cell_pool.pop(0)
def release_temp_cell(self, cell):
self.temp_cell_pool.insert(0, cell) # LIFO reuse
The LIFO (last-in-first-out) reuse pattern is intentional—recently freed cells get reused first, which keeps the data pointer from wandering all over the tape. It’s basically register allocation for a tape machine.
If you run out of temp cells, the compiler throws an error instead of silently corrupting memory. Better to fail loudly than generate broken code.
Control Flow: Making Loops and Ifs Work
This was the hardest part. Brainfuck only has [...]
(loop while cell ≠ 0). No if
/else
, no break
, no labels you can jump to. And every operation can move the data pointer, so if you’re not careful, you’ll end up in the wrong place and everything breaks.
My solution was to build abstractions that enforce pointer discipline. For example, here’s loop_managed
:
def loop_managed(self, condition_cell, loop_func):
code = self.move_to_cell(condition_cell)
code += self.open_brace() # [
code += loop_func() # Execute the loop body (might move pointer)
code += self.move_to_cell(condition_cell) # Force pointer back
code += self.close_brace() # ]
return code
The key insight: after the loop body runs, explicitly move the pointer back to the condition cell. This prevents pointer drift bugs where the loop breaks because you’re checking the wrong cell.
For if/else
, it’s trickier. I copy the condition to a temp cell, then use [...]
to run the if-branch. For the else-branch, I set a flag beforehand and clear it if the if-branch runs—then I use another [...]
to check if the flag is still set. It’s hacky but it works.
Comparisons: The Countdown Algorithm
You can’t just check if a > b
in Brainfuck. No comparison instructions exist. So I had to get creative.
My solution for greater_than
:
- Copy
a
and b
to temporary cells (let’s call them temp1
and temp2
)
- Loop: decrement both
temp1
and temp2
by 1 each iteration
- If
temp2
hits zero first, then a > b
(set result to 1)
- If
temp1
hits zero first or they hit zero together, then a ≤ b
(result stays 0)
It’s essentially simulating a comparison by counting down both numbers in parallel and seeing which runs out first. Not the most efficient algorithm, but it works and it’s provably correct.
Similar logic applies to <
, ==
, etc. Each comparison has its own little algorithm.
Why This Matters
I didn’t find tutorials for any of this. The problem space—compiling a C-like language to Brainfuck with complex expressions and control flow—basically doesn’t exist elsewhere. These patterns emerged from trial and error, trying to keep the compiler maintainable as I added features.
The closure-based approach especially felt like a breakthrough moment. It made composition natural, prevented a ton of bugs, and kept the code surprisingly clean considering what it’s doing.
Navigating Challenges: Hurdles and Solutions
This project was definitely challenging, pushing me to learn quite a bit.
The Outcome: Where It Stands and What I Learned
BFScript is currently functional and usable. You can write programs like the pyramid example above and compile them into working Brainfuck code. While there’s always room for improvement and more features (some edge cases in conditionals are still buggy, and I’m not going back to debug that nightmare), I’m happy with its current state as a proof-of-concept and a learning tool.
- Goals Achieved: Yes, the main goal of creating a compiler that translates a C-like syntax into Turing-complete Brainfuck, overcoming the limitations of my previous transpiler, was met.
- Key Learnings:
- A lot about compiler fundamentals (parsing, ASTs, code generation).
- Deep appreciation for the challenges of working in highly constrained environments like Brainfuck.
- How to map high-level programming constructs to low-level operations.
- The value of using good tools and libraries (like Lark).
- Problem-solving and debugging techniques for unconventional code.
- That I could design compiler patterns independently that turn out to mirror real academic research.
- Proudest Aspect: Honestly, the closure-based code generation system. It emerged organically as I tried to handle nested expressions, and realizing later that it’s similar to patterns from academic PL research (tagless final) was really validating. That “independent discovery” moment felt great.
- Future Ideas: While not actively planned, I’ve considered exploring optimizations for the generated Brainfuck code (making it shorter or faster—there’s definitely redundant
+-<>
sequences I could collapse). The idea of using LLVM Intermediate Representation (IR) as a source, allowing potentially any language that compiles to LLVM to be compiled to Brainfuck, is also an interesting, though very ambitious, future thought experiment.