This post is a compilation of great resources I found while building a type checker for Python. These resources are free and highly focused on specific topics, making them ideal for learning by doing rather than going through extensive materials.
Parser
When it comes to writing a parser, your approach depends on your project’s goals. For compilers or interpreters, you can use a parser generator. However, if you’re working on tools like formatters or language servers, your parser needs to handle broken code gracefully. You can do this either with a tool like Tree-sitter, which handles broken code to some extent, or by writing your own parser. Of course, writing your own is more fun.
- “Write JS Parser in Rust” by Boshen is an excellent introductory guide.
- For resilient parsing, check out this tutorial on resilient LL parsing.
- Your language’s official documentation. For Python, there is the Python ast module.
- Look into the implementations of open source linters or compilers. The RustPython Lexer is a good one for Python.
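To see why broken code matters, here is a minimal sketch using Python's standard `ast` module. It parses a small snippet, lists the top-level statements, and then shows that the standard parser rejects incomplete input outright instead of returning a partial tree. The helper name `try_parse` is made up for this illustration.

```python
import ast

source = "x = 1 + 2\nprint(x)\n"

# Parse the source into an abstract syntax tree.
tree = ast.parse(source)

# Collect (node type, line number) for each top-level statement.
statements = [(type(node).__name__, node.lineno) for node in tree.body]
print(statements)  # [('Assign', 1), ('Expr', 2)]


def try_parse(code: str):
    """Hypothetical helper: return (tree, None) or (None, error message)."""
    try:
        return ast.parse(code), None
    except SyntaxError as exc:
        # ast.parse stops at the first syntax error and gives no partial tree.
        return None, f"line {exc.lineno}: {exc.msg}"


# An unfinished expression, like what an editor sees while you are typing.
broken_tree, broken_error = try_parse("x = (1 +\n")
print(broken_tree, broken_error)
```

A formatter or language server cannot stop at that first error, which is the gap that resilient parsing techniques are meant to fill.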
Compilers & Interpreters
“Crafting Interpreters” is an essential resource for compilers. I recommend reading it chapter by chapter as you build your project.
For a comprehensive understanding of relevant topics, consider following the Stanford Compilers Class. Although I haven’t watched it personally, I found this guide based on the class quite helpful.
Reading through finished implementations is pretty important. Programming Languages Zoo is one resource for this.
Symbol Table
For the symbol table, you need to study the language implementation and know its scoping rules, private/public visibility, and the different kinds of symbols. There’s no all-in-one solution.
This series on the Python symbol table implementation from Eli Bendersky is useful for learning how a symbol table works.
- RustPython’s SymbolTable implementation.
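If you want to poke at a real symbol table before writing your own, CPython exposes the one its compiler builds through the standard `symtable` module. The sketch below is only an exploration aid: the sample source and the `classify` helper are invented for this example, and the three categories it reports are a simplification of what the module can tell you.

```python
import symtable

source = '''
counter = 0

def bump(step):
    global counter
    total = counter + step
    counter = total
    return total
'''

# Build the symbol table for the whole module.
module_table = symtable.symtable(source, "<example>", "exec")

# Each nested scope (here, the function body) is a child table.
bump_table = next(
    table for table in module_table.get_children() if table.get_name() == "bump"
)


def classify(sym: symtable.Symbol) -> str:
    """Hypothetical helper: reduce a symbol to one coarse category."""
    if sym.is_parameter():
        return "parameter"
    if sym.is_declared_global():
        return "declared global"
    if sym.is_local():
        return "local"
    return "other"


kinds = {sym.get_name(): classify(sym) for sym in bump_table.get_symbols()}
print(kinds)
```

Here `step` comes out as a parameter, `total` as a local, and `counter` as a declared global, which mirrors the scoping rules a checker has to reproduce.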
Semantic Analyzer
While resources specific to the semantic analysis phase are scarce, you can find inspiration and solutions in existing projects:
Type Checking
For type checking, you are mostly interested in the type rules of the language. It’s therefore worth studying other type checker implementations: they teach you both the rules and how to enforce them.
- The design of pyanalyze for Python.
- For mypy, the Type Checker wiki.
- Internal details of the Jedi language server.
- Pyright internals.
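To make "type rules" concrete, here is a toy sketch of a checker for simple assignments built on the `ast` module. Everything in it is an assumption for illustration: the names `infer`, `check`, and `TypeCheckError` are invented, types are plain strings, and the single rule for `+` is deliberately simplified. It is not how mypy, Pyright, or pyanalyze work internally.

```python
import ast

# Literal types this toy understands (bool is intentionally left out).
LITERAL_TYPES = {int: "int", float: "float", str: "str"}


class TypeCheckError(Exception):
    pass


def infer(node: ast.expr, env: dict) -> str:
    """Infer a type name for an expression, or raise TypeCheckError."""
    if isinstance(node, ast.Constant) and type(node.value) in LITERAL_TYPES:
        return LITERAL_TYPES[type(node.value)]
    if isinstance(node, ast.Name):
        if node.id not in env:
            raise TypeCheckError(f"line {node.lineno}: unknown name {node.id}")
        return env[node.id]
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        left = infer(node.left, env)
        right = infer(node.right, env)
        if left == right:
            return left  # int + int, float + float, str + str
        if {left, right} == {"int", "float"}:
            return "float"  # mixed numeric addition widens to float
        raise TypeCheckError(f"line {node.lineno}: cannot add {left} and {right}")
    raise TypeCheckError(f"line {node.lineno}: unsupported expression")


def check(source: str) -> list:
    """Check a sequence of `name = expr` statements and return error messages."""
    env = {}
    errors = []
    for stmt in ast.parse(source).body:
        simple = (
            isinstance(stmt, ast.Assign)
            and len(stmt.targets) == 1
            and isinstance(stmt.targets[0], ast.Name)
        )
        if not simple:
            errors.append(f"line {stmt.lineno}: unsupported statement")
            continue
        try:
            env[stmt.targets[0].id] = infer(stmt.value, env)
        except TypeCheckError as exc:
            errors.append(str(exc))
    return errors


print(check("x = 1\ny = x + 2.5\nz = y + 'a'\n"))
# ['line 3: cannot add float and str']
```

Real checkers have to encode far more rules (subtyping, generics, narrowing, protocols), which is why reading their source and design docs is so useful.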
LSP (Language Server Protocol)
For a comprehensive understanding of language servers, file systems, updates, and testing, check out this Explaining Rust Analyzer YouTube playlist from matklad.
The LSP specification is very easy to read. It’s long, but you don’t need everything in the beginning. To skip defining every structure yourself, you can use Tower LSP.
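To get a feel for what the specification describes at the lowest level, here is a small sketch of the base protocol framing: a `Content-Length` header, a blank line, then a JSON-RPC body encoded as UTF-8. The function names `encode_message` and `decode_message` are invented for this example, and the code is unrelated to Tower LSP, which handles this layer for you in Rust.

```python
import json


def encode_message(payload: dict) -> bytes:
    """Frame a JSON-RPC payload with the Content-Length header."""
    body = json.dumps(payload).encode("utf-8")
    # Content-Length counts bytes of the body, not characters.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_message(data: bytes):
    """Decode one message from the front of data; return (payload, rest)."""
    header_blob, separator, rest = data.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("incomplete header")
    length = None
    for line in header_blob.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    if len(rest) < length:
        raise ValueError("incomplete body")
    return json.loads(rest[:length].decode("utf-8")), rest[length:]


# A minimal initialize request; a real client sends more capabilities.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {"processId": None, "rootUri": None, "capabilities": {}},
}

wire = encode_message(request)
decoded, leftover = decode_message(wire)
print(decoded["method"], leftover)
```

Returning the leftover bytes lets a caller keep decoding when several messages arrive in one read from the stream.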
Linters
As with type checking, for linters it’s best to look into existing implementations and learn from them.
The following tools are useful to understand how analysis is done and errors are reported:
Final Words
Compilers are super fun. If you have more resources, please send them to me.