EuroPython

Automated Refactoring Large Python Codebases

Room: Liffey A
Start: 13:35 on 15 July 2022
Duration: 30 minutes

Abstract

Like many companies with multi-million-line Python codebases, Carta has struggled to adopt best practices such as Black formatting and type annotations. The extra work needed to do the right thing competes with the almost overwhelming demand for new development, and unclear code ownership and a lack of insight into the size and scope of type problems add to the burden. We’ve greatly mitigated these problems by building an automated refactoring pipeline that applies Black formatting and backfills missing types via incremental GitHub pull requests. Our refactoring applications use LibCST and MonkeyType to modify the Python syntax tree and GitPython/PyGithub to create and manage pull requests. The pipeline divides changes into small, easily reviewed pull requests and assigns the appropriate code owners to review them. After creating and merging more than 3,000 pull requests, we have fully converted our large codebase to Black format and added type annotations to more than 50,000 functions. In this talk, you’ll learn how to use LibCST to build automated refactoring tools that fix general Python code quality issues at scale, and how to use GitPython/PyGithub to automate the code review process.
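
As a rough illustration of the batching step described in the abstract (small pull requests, each routed to the appropriate code owners), here is a minimal standard-library sketch. It is not Carta's pipeline: the names `owner_for`, `plan_pull_requests`, and `PullRequestPlan`, the prefix-based owner mapping, and the `max_files` limit are all hypothetical, and the actual creation of pull requests with GitPython/PyGithub is left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class PullRequestPlan(NamedTuple):
    """One planned pull request: a reviewer group and the files it covers (hypothetical type)."""
    owner: str
    files: List[str]


def owner_for(path: str, owners: Dict[str, str], default: str = "unowned") -> str:
    """Return the owner whose path prefix is the longest match for `path`.

    Assumption: ownership is a simple prefix -> owner mapping, which is a
    simplification of real code-owner rules.
    """
    matches = [prefix for prefix in owners if path.startswith(prefix)]
    return owners[max(matches, key=len)] if matches else default


def plan_pull_requests(
    changed_files: Iterable[str],
    owners: Dict[str, str],
    max_files: int = 2,
) -> List[PullRequestPlan]:
    """Group changed files by owner, then split each group into small batches."""
    by_owner: Dict[str, List[str]] = defaultdict(list)
    for path in sorted(changed_files):
        by_owner[owner_for(path, owners)].append(path)

    plans: List[PullRequestPlan] = []
    for owner in sorted(by_owner):
        files = by_owner[owner]
        # Each slice of at most `max_files` files becomes one reviewable pull request.
        for start in range(0, len(files), max_files):
            plans.append(PullRequestPlan(owner, files[start:start + max_files]))
    return plans


if __name__ == "__main__":
    example_owners = {"billing/": "team-billing", "billing/tax/": "team-tax"}
    example_files = ["billing/a.py", "billing/b.py", "billing/c.py", "billing/tax/d.py"]
    for plan in plan_pull_requests(example_files, example_owners):
        print(plan.owner, plan.files)
```

Keeping each batch within a single owner means every planned pull request has one clear reviewer group, which is the property the abstract credits with making the changes easy to review.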

Talk · Software Engineering & Architecture


The speaker

Jimmy Lai

Jimmy Lai is a Software Engineer at Instagram and Carta Infrastructure. He loves Python and likes to share that love in tech talks. His recent interest is automated refactoring, and his previous talk topics include profiling, optimization, asyncio, and type annotations.


