Process, Analyze, and Transform Python Code with ASTs

Stefanie Molin

Python Language & Ecosystem
Python Skill Intermediate
Domain Expertise Novice

This tutorial will be a roughly 50/50 split of lecture and exercises. Attendees will get hands-on experience working with ASTs in Python, using only the standard library. By recreating common code-quality checks from scratch, attendees will learn both how common tools work under the hood and how to work with the AST in an easy-to-understand fashion. If I’m short on time, I will cut from the final section, since it builds upon the previous ones; the foundation from the earlier topics will be more than sufficient for attendees to continue their studies after the tutorial.
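To give a flavor of what recreating a code-quality check with only the standard library can look like, here is a minimal sketch that flags functions and classes lacking a docstring. It is an illustration rather than one of the tutorial's exercises: the sample source and the helper name `find_missing_docstrings` are hypothetical, and the type hint assumes Python 3.9 or later.

```python
import ast

# Hypothetical sample source to analyze; not taken from the tutorial materials.
SOURCE = '''
def documented():
    """Say hello."""
    return "hello"

def undocumented():
    return "goodbye"
'''


def find_missing_docstrings(source: str) -> list[str]:
    """Return the names of functions and classes in `source` without a docstring."""
    tree = ast.parse(source)  # turn the source text into an AST
    return [
        node.name
        for node in ast.walk(tree)  # visit every node; order is not guaranteed
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
        and ast.get_docstring(node) is None
    ]


missing = find_missing_docstrings(SOURCE)
print(missing)  # ['undocumented']
```

Because the check works on the parsed tree rather than on raw text, comments and string contents that merely look like definitions cannot produce false matches.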

  1. Introduction to ASTs
    1. Introduction to me and the plan for the session
    2. Introduce the term and concept of Abstract Syntax Trees (ASTs)
    3. Mention some of the ways ASTs are used by Python itself and by popular tools
    4. Parsing code into an AST and printing it
    5. Converting an AST into source code again
    6. Exercise break
  2. AST traversal, part 1
    1. Walking the tree
    2. Overview of AST node types encountered during our walk (ast.Module, ast.ClassDef, ast.FunctionDef, etc.)
    3. Exercise break
  3. AST traversal, part 2 (90 minutes)
    1. Creating an ast.NodeVisitor for basic traversal covering the visit() and generic_visit() methods
    2. Exercise break, preceded by an introduction to any new AST node types, where necessary
    3. Visualizing how the traversal was performed and discussing why, as a result, some applications need to track node ancestry
    4. Tracking node ancestry during traversal with a stack
    5. Exercise break, preceded by an introduction to any new AST node types, where necessary
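The main steps in the outline above (parsing, printing, converting back to source, and traversing with an `ast.NodeVisitor` while tracking ancestry on a stack) can be sketched roughly as follows. This is an illustrative sketch, not the tutorial's own material: the sample source and the `FunctionLocator` class are hypothetical, and `ast.unparse()` and the `indent` argument of `ast.dump()` assume Python 3.9 or later.

```python
import ast

# Hypothetical sample source; not taken from the tutorial materials.
SOURCE = '''
class Greeter:
    def greet(self, name):
        return f"Hello, {name}!"

def main():
    print(Greeter().greet("world"))
'''

# Parse the code into an AST and print a readable representation of it.
tree = ast.parse(SOURCE)
print(ast.dump(tree, indent=2))

# Convert the AST into source code again (formatting and comments are not preserved).
regenerated = ast.unparse(tree)
print(regenerated)


class FunctionLocator(ast.NodeVisitor):
    """Record each function's dotted path by tracking its ancestors on a stack."""

    def __init__(self):
        self.stack = []            # names of the enclosing classes and functions
        self.qualified_names = []  # dotted paths of the functions found so far

    def _visit_scope(self, node):
        self.stack.append(node.name)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            self.qualified_names.append(".".join(self.stack))
        self.generic_visit(node)   # keep traversing into the node's children
        self.stack.pop()           # leaving this scope, so drop it from the ancestry

    # visit() dispatches to visit_<NodeType> and falls back to generic_visit().
    visit_ClassDef = _visit_scope
    visit_FunctionDef = _visit_scope
    visit_AsyncFunctionDef = _visit_scope


locator = FunctionLocator()
locator.visit(tree)
print(locator.qualified_names)  # ['Greeter.greet', 'main']
```

Without the stack, the visitor would see `greet` and `main` as two plain `ast.FunctionDef` nodes with no way to tell that one of them is a method of `Greeter`, which is the kind of situation where tracking ancestry becomes necessary.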

Stefanie Molin

Stefanie Molin is a software engineer at Bloomberg in New York City, where she tackles tough problems in information security, particularly those revolving around data wrangling/visualization, building tools for gathering data, and knowledge sharing. She is also a core developer of numpydoc and the author of “Hands-On Data Analysis with Pandas: A Python data science handbook for data collection, wrangling, analysis, and visualization,” which is currently in its second edition and has been translated into Korean and Chinese. She holds a bachelor of science degree in operations research from Columbia University’s Fu Foundation School of Engineering and Applied Science, as well as a master’s degree in computer science, with a specialization in machine learning, from Georgia Tech. In her free time, she enjoys traveling the world, inventing new recipes, and learning new languages spoken among both people and computers.