24. Changing CPython’s Grammar
24.1. Abstract
There’s more to changing Python’s grammar than editing Grammar/Grammar. This document aims to be a checklist of places that must also be fixed.
It is probably incomplete. If you see omissions, submit a bug or patch.
This document is not intended to be an instruction manual on Python grammar hacking, for several reasons.
24.2. Rationale
People are getting this wrong all the time; it took well over a year before someone noticed that adding the floor division operator (//) broke the parser module.
24.3. Checklist
Note: sometimes things mysteriously don’t work. Before giving up, try make clean.
- Grammar/Grammar: OK, you’d probably worked this one out. :-) After changing it, run make regen-grammar to regenerate Include/graminit.h and Python/graminit.c. (This runs Python’s parser generator, Python/pgen.)
- Grammar/Tokens is the place for adding new token types. After changing it, run make regen-token to regenerate Include/token.h, Parser/token.c, Lib/token.py and Doc/library/token-list.inc. If you change both Grammar/Grammar and Grammar/Tokens, run make regen-token before make regen-grammar.
- Parser/Python.asdl may need changes to match the Grammar. Then run make regen-ast to regenerate Include/Python-ast.h and Python/Python-ast.c.
- Parser/tokenizer.c contains the tokenization code. This is where you would add a new type of comment or string literal, for example.
- Python/ast.c will need changes to create the AST objects involved with the Grammar change.
- The Design of CPython’s Compiler has its own page.
- The parser module. Add some of your new syntax to test_parser, then work on Modules/parsermodule.c until the test passes.
- Add some usage of your new syntax to test_grammar.py.
- Certain changes may require tweaks to the library module pyclbr.
- Lib/tokenize.py needs changes to match changes to the tokenizer.
- Lib/lib2to3/Grammar.txt may need changes to match the Grammar.
- Documentation must be written!
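
Several items in the checklist touch different layers (the tokenizer, the AST, the compiler), so it can help to see how one piece of existing syntax shows up at each layer. The following is only a rough sketch, using the floor division operator mentioned in the rationale as the sample. The helper names (operator_tokens, binop_node_names, run) and the SOURCE string are made up for illustration and are not part of CPython or its test suite.

```python
import ast
import io
import token
import tokenize

# Sample source using the floor division operator from the rationale above.
SOURCE = "q = 7 // 2\n"


def operator_tokens(source):
    """Return the exact token-type name of every operator token in source."""
    readline = io.StringIO(source).readline
    return [
        token.tok_name[tok.exact_type]
        for tok in tokenize.generate_tokens(readline)
        if tok.type == token.OP
    ]


def binop_node_names(source):
    """Return the class name of the operator in every BinOp node of the AST."""
    tree = ast.parse(source)
    return [
        type(node.op).__name__
        for node in ast.walk(tree)
        if isinstance(node, ast.BinOp)
    ]


def run(source):
    """Compile and execute source, returning the resulting namespace."""
    namespace = {}
    exec(compile(source, "<sketch>", "exec"), namespace)
    return namespace


print(operator_tokens(SOURCE))   # tokenizer layer: ['EQUAL', 'DOUBLESLASH']
print(binop_node_names(SOURCE))  # AST layer: ['FloorDiv']
print(run(SOURCE)["q"])          # compiler and interpreter: 3
```

For new syntax, a check like this would fail at whichever layer has not been updated yet, which gives a hint about which checklist item is still outstanding. It is not a substitute for adding real tests to test_grammar.py.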