You really do need to be modifying the AST and not just applying a regex to the source code text. This can be a challenge if you want to modify Python syntax, as you first need to get an AST you can modify. Personally, I recommend starting with the Lark library and modifying the Python grammar it includes.
I use similar techniques to transpile OpenSCAD code into a Python AST in my "pySdfScad" project, while still theoretically getting the benefits of fancy tracebacks and debugging and the like. Probably should have gone with a simple parser instead, but what can you do.
I think they should have stopped at the "cursed way" and not the "truly cursed way". If they really wanted the syntax changes, then having your own Python parser like the Lark implementation I mention above is a must.
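To illustrate the AST-versus-regex point, here is a minimal sketch. It uses the stdlib `ast` module rather than Lark, so it only works on source that already parses as Python; the `RenameVariable` class and the sample source are made up for the example. A naive text replace of `i` would also mangle the string literal, while the AST rewrite only touches real names.

```python
import ast


class RenameVariable(ast.NodeTransformer):
    """Hypothetical transformer: rename one variable, leave strings alone."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def visit_Name(self, node):
        # Only real identifiers are Name nodes; string contents are Constant nodes.
        if node.id == self.old:
            return ast.copy_location(ast.Name(id=self.new, ctx=node.ctx), node)
        return node


source = 'i = 3\nmessage = "i is " + str(i)\n'

tree = RenameVariable("i", "index").visit(ast.parse(source))
ast.fix_missing_locations(tree)

# ast.unparse needs Python 3.9 or newer.
rewritten = ast.unparse(tree)

namespace = {}
exec(compile(tree, "<rewritten>", "exec"), namespace)
```

Compiling the modified tree directly, instead of going back through text, is what keeps line numbers and tracebacks usable.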
It's true, the truly cursed method should never be used in a project that you actually aim to use. But, I just wanted to see how close I could get to the original idea, regardless of how "bad" the solution ended up being.
The second method indeed goes very far! I've made some very cool libraries and talks with that alone.
Then, there is also the new frame evaluation API (https://peps.python.org/pep-0523/), which allows you to dynamically rewrite bytecode. This has been used by PyTorch 2.0 for torch.compile.
I did something similar to option 3 to make the builtin numeric types "callable", since the `__call__` dunder method can't be overridden on builtins. For example, in regular arithmetic notation, something like 6(7+8) could be read as 6*(7+8), but trying to do this in Python gives you `SyntaxWarning: 'int' object is not callable; perhaps you missed a comma?` at compile time and then a `TypeError` at runtime, since you're essentially trying to do a function call on the literal 6. The workaround was to use a custom codec to wrap all integer literals to give them the expected call behavior.
This was inspired by a way less silly use case, future f-strings, which added f-string support to older versions of Python in a similar way using codecs: https://github.com/asottile-archive/future-fstrings
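A rough sketch of the literal-wrapping step, assuming it is done with the stdlib `tokenize` module. The `CallableInt` class and `wrap_int_literals` function are hypothetical names for illustration, not the code from the project described above, and the codec plumbing is left out: the transformed source is simply passed to `exec`.

```python
import io
import tokenize


class CallableInt(int):
    """Hypothetical int subclass: calling it multiplies, so 6(7+8) means 6*(7+8)."""

    def __call__(self, other):
        return CallableInt(int(self) * other)


def wrap_int_literals(source):
    """Rewrite every plain decimal integer literal N as CallableInt(N)."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NUMBER and tok.string.isdigit():
            out.extend([
                (tokenize.NAME, "CallableInt"),
                (tokenize.OP, "("),
                (tokenize.NUMBER, tok.string),
                (tokenize.OP, ")"),
            ])
        else:
            out.append((tok.type, tok.string))
    # With 2-tuples, untokenize produces valid but oddly spaced source.
    return tokenize.untokenize(out)


namespace = {"CallableInt": CallableInt}
exec(wrap_int_literals("result = 6(7 + 8)\n"), namespace)
```

Because the rewrite works on tokens, digits inside strings and comments are left alone, which a plain regex over the text would not guarantee.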
True, but I think that source transformations on import are already supported in Python ("officially") based on importlib [0-1].
Here's an example use case where the author is trying to create a Python enum from Thrift specs: [2]. The issue here is that the spec code is not necessarily valid Python, but can be fixed easily with some search & replace operations.
One of the benefits of implementing your own module finder and loader is that you could use a separate file extension to avoid confusion with actual Python code.
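A minimal sketch of that finder/loader idea, under some loud assumptions: the `.spec` extension, the `SpecFinder` and `RewritingLoader` names, and the `//`-to-`#` replacement are all invented for the example and are not taken from the linked posts. The loader overrides `source_to_code`, so the search & replace runs just before compilation.

```python
import importlib
import importlib.abc
import importlib.machinery
import importlib.util
import sys
import tempfile
from pathlib import Path

SUFFIX = ".spec"  # hypothetical extension, to keep these files apart from real .py


class RewritingLoader(importlib.machinery.SourceFileLoader):
    """Hypothetical loader: patch the source text, then compile it as Python."""

    def source_to_code(self, data, path, *, _optimize=-1):
        text = importlib.util.decode_source(data)
        # Naive search & replace: turn //-style comments into Python comments.
        # (This would also break the // operator; fine for a sketch.)
        text = text.replace("//", "#")
        return compile(text, path, "exec", dont_inherit=True, optimize=_optimize)


class SpecFinder(importlib.abc.MetaPathFinder):
    """Hypothetical finder: looks for <module name>.spec in one directory."""

    def __init__(self, directory):
        self.directory = Path(directory)

    def find_spec(self, fullname, path=None, target=None):
        candidate = self.directory / (fullname + SUFFIX)
        if not candidate.is_file():
            return None
        loader = RewritingLoader(fullname, str(candidate))
        return importlib.util.spec_from_file_location(
            fullname, str(candidate), loader=loader
        )


# Demo: write a file that is not valid Python as-is, then import it.
demo_dir = tempfile.mkdtemp()
Path(demo_dir, "demo_colors" + SUFFIX).write_text(
    "// made-up spec syntax\nRED = 1\nGREEN = 2\n"
)
sys.meta_path.append(SpecFinder(demo_dir))
demo_colors = importlib.import_module("demo_colors")
```

Appending to `sys.meta_path` (rather than inserting at the front) means the normal import machinery gets the first look, and this finder only handles names nothing else could resolve.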
> The idea of using (abusing in fact) codecs to pre-process python code before it gets to the actual python interpreter is just fantastic!
Coming from JavaScript, which has Babel, this is kind of an everyday occurrence. Through the magic of source code transforms you can easily add whatever experimental new JS features you want to your project!
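For anyone who hasn't seen the codec trick, a bare-bones sketch of the registration side. The codec name `cursedfor` and the `loop forever:` pseudo-syntax are invented for the example. This only wires up `decode`, which is enough for `bytes.decode`; as far as I understand, a codec meant for a `# coding:` line on a real source file also needs an incremental decoder (future-fstrings defines one), and that part is omitted here.

```python
import codecs

CODEC_NAME = "cursedfor"  # hypothetical codec name


def _decode(data, errors="strict"):
    """Decode as UTF-8, then rewrite the made-up syntax into real Python."""
    text, consumed = codecs.utf_8_decode(bytes(data), errors, True)
    return text.replace("loop forever:", "while True:"), consumed


def _search(name):
    # Return None for any other name so other codec lookups keep working.
    if name != CODEC_NAME:
        return None
    return codecs.CodecInfo(
        name=CODEC_NAME,
        encode=codecs.utf_8_encode,
        decode=_decode,
    )


codecs.register(_search)

transformed = b"loop forever:\n    break\n".decode(CODEC_NAME)
```

The "encoding" is just UTF-8 plus an arbitrary text rewrite, which is exactly why it works as a preprocessor hook.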
It's better to make exception handling explicit. It has the same semantic complexity as passing the possible good exception types into retry or attempt.
Inline loop logic to decide if iteration is successful can be tricky.
I wonder if the codec could use python's lexer (assuming it's exposed) to parse the for loops and nothing else. Then replace the loops with a placeholder, and then replace the placeholder in the AST after a parse. Might be cleaner than source->source transform by the codec, maybe not.
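The lexer is exposed, as the stdlib `tokenize` module. A small sketch of just the "find the loops" step, assuming a C-style loop looks like `for (init; cond; step):`. The function name and the sample source are made up, and the placeholder/AST replacement step is not shown.

```python
import io
import tokenize


def find_c_style_for_loops(source):
    """Return (row, col) of each `for (` whose parentheses contain a top-level `;`."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    found = []
    for index, tok in enumerate(tokens[:-1]):
        if not (tok.type == tokenize.NAME and tok.string == "for"):
            continue
        if tokens[index + 1].string != "(":
            continue
        depth = 0
        for inner in tokens[index + 1:]:
            if inner.type != tokenize.OP:
                continue
            if inner.string == "(":
                depth += 1
            elif inner.string == ")":
                depth -= 1
                if depth == 0:
                    break  # ordinary loop such as `for (a, b) in pairs:`
            elif inner.string == ";" and depth == 1:
                found.append(tok.start)
                break
    return found


sample = (
    'msg = "for (i = 0; i < 3; i += 1)"\n'
    "for (i = 0; i < 3; i += 1):\n"
    "    print(i)\n"
    "for (a, b) in pairs:\n"
    "    pass\n"
)
loops = find_c_style_for_loops(sample)
```

The tokenizer doesn't check grammar, so it happily lexes the invalid loop header, and the lookalike inside the string literal is a single STRING token that never matches.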
Where did the "cursed" adjective start being used like this? It's not used in my country, and I've only seen it being used in the last couple of years as part of subreddit titles.
Codecs free to transform all code before it's executed? Sounds like the perfect place for hackers to hide RCEs that are very difficult to spot. Just hide an innocent-looking comment at the top of a file.
1. I wish more HN posts ended in "for fun" -- so much of what makes being a programmer fun and enjoyable is plumbing the depths of what is possible, not because it's "best practice" or whatever. I think these types of forays are where true mastery comes from.
2. The fewer reserved keywords a language has, the better. It makes things like this easier. Years ago I figured out how to implement Goto in Smalltalk "for fun". It was easier because Smalltalk had fewer reserved keywords.
I love 4 fun posts here, it's a shame they aren't more common, as usually when they're here someone inevitably with their nose as high in the air as possible chimes in with how this isn't profitable or a business model and will die soon.
This adds C-style for loops to Hy, which runs on the normal CPython VM, and therefore can make use of the thousands of open source Python libraries available.
My reply to the post was tongue in cheek. The point is that you can add on arbitrary language features to Python using Hy (any Lisp, really). The syntax in question is normally referred to as an s-expression.
No, just saying you can trivially convert any Lisp code to C-style syntax by moving the function call parentheses to the right and converting wrapping parens into curly braces.
Link for the Lark Python grammar mentioned at the top of the thread: https://lark-parser.readthedocs.io/en/latest/examples/advanc...