Accepted sessions
Note: this list might change
Working with Audio in Python (feat. Pedalboard)
Come hear about how to play with audio in only a couple lines of Python!
Python can do (nearly) anything, but using Python to work with audio has always been a complicated and messy affair. In this talk, we'll be going through how digital audio works, how Python can be used to play with audio data, and how a new library - Pedalboard - can help. Pedalboard is a simple, fast, and performant library for doing common audio tasks in Python, including applying effects, using VSTs and audio plugins, and encoding/decoding various audio formats.
TalkPython Libraries
Python objects under the hood
Have you ever heard of Python's magic methods? I am sorry, but they are not that "magic"! I agree they are really cool, but dunder methods (the name they usually go by) are just regular Python methods that you implement! And it is my job to help you learn about them.
Dunder methods are the methods that you need to implement when you want your objects to interact with the syntax of Python.
Do you want len to be callable on your objects? Implement __len__.
Do you want your objects to be iterable? Implement __iter__.
Do you want arithmetic to work on your objects? Implement __add__ (and a bunch of others!).
Just to name a few things your objects could be doing.
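As a minimal sketch of the three methods named above (the Playlist class and its contents are hypothetical and are not taken from the training material), the following shows how each dunder method hooks an object into Python syntax:

```python
class Playlist:
    """A tiny container used only to illustrate dunder methods."""

    def __init__(self, songs):
        self.songs = list(songs)

    def __len__(self):
        # Called by len(playlist).
        return len(self.songs)

    def __iter__(self):
        # Called by iter(playlist), and therefore by for-loops and list().
        return iter(self.songs)

    def __add__(self, other):
        # Called by playlist + other.
        if not isinstance(other, Playlist):
            return NotImplemented
        return Playlist(self.songs + other.songs)


morning = Playlist(["Intro", "Warm-up"])
evening = Playlist(["Wind-down"])
combined = morning + evening

print(len(combined))   # 3
print(list(combined))  # ['Intro', 'Warm-up', 'Wind-down']
```

Returning NotImplemented from __add__ lets Python try the other operand's reflected method before raising a TypeError.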
In this training, we will go over a series of small use cases for many of the existing dunder methods: we will learn about the way in which each dunder method is supposed to work and then implement it. This will make you a more well-rounded Python developer because you will have a greater appreciation for how things work in Python. I will also show you the approaches I follow when I am learning about a new dunder method and trying to understand how it works, which will help you explore the remaining dunder methods by yourself.
For this training, you need Python 3.8+ and your code editor of choice.
Get the slides, exercises, and other resources on GitHub.
Tutorial(c)Python Internals
Trans*Code
Trans*Code is a free, full-day workshop and hackday open to trans and non-binary folk, allies, coders, designers and visionaries of all sorts.
Trans*Code events aim to help draw attention to transgender issues through informal, topic-focused hackdays. Coders, designers, activists, and community members not currently working in technology are also encouraged to participate.
Come join members of the trans and non-binary community and allies for this day of hacking, sharing, community, and fun.
Special WorkshopEvents
Managing complex data science experiment configurations with Hydra
Data science experiments have a lot of moving parts. Datasets, models, and hyperparameters all have multiple knobs and dials. This means that keeping track of the exact parameter values can be tedious and error-prone.
Thankfully you're not the only ones facing this problem and solutions are becoming available. One of them is Hydra from Meta AI Research. Hydra is an open-source application framework, which helps you handle complex configurations in an easy and elegant way. Experiments written with Hydra are traceable and reproducible with minimal boilerplate code.
In my talk I will go over the main features of Hydra and the OmegaConf configuration system it is based on. I will show examples of elegant code written with Hydra and talk about ways to integrate it with other open-source tools such as MLflow.
TalkPyData: Software Packages & Jupyter
Self-explaining APIs
To mash up various APIs, data needs to have a well-defined meaning: imagine mashing up healthcare APIs that use different units for body temperature, or financial APIs that use different currencies.
This talk describes strategies and Python tools to overcome these problems in large API ecosystems, such as data exchanges between different countries.
TalkSoftware Engineering & Architecture
LocalStack: Turbocharging dev loops and team collaboration for cloud applications
With the staggering dominance of public cloud providers, dev teams across the globe are increasingly focusing time and energy on optimizing their cloud development and deployment flows. The traditional deploy-and-test cycles against public clouds can become slow and tedious, with developers often facing several minutes of idle time between deployments that need to be frequently triggered during testing and debugging.
In this session, we provide a hands-on introduction to LocalStack (39k+ GitHub stars), a fully functional local AWS cloud stack. With LocalStack, applications can be developed entirely on your local machine, reducing dev and test cycles from minutes to seconds.
The session covers interactive live coding to showcase common scenarios and use cases, different settings for local debugging of Lambdas and containerized apps (e.g., ECS/EKS), as well as some advanced new features that can radically improve productivity and team collaboration patterns. We will also glance over the large ecosystem of tools that LocalStack natively integrates with - from IaC frameworks like Terraform or Pulumi, to application frameworks like Serverless or Architect, to a whole suite of tools provided by AWS itself (CDK, SAM, Copilot, Chalice, etc).
We'll wrap up the session with a deep dive into some of the Python internals of LocalStack, which reveals some interesting architectural patterns and hidden gems!
TalkInfrastructure: Cloud & Hardware
Python for Arts, Humanities and Social Sciences
Computational methods, particularly those involving data analytics, are now taking root in various humanities disciplines. However, students and researchers working in these disciplines lack the necessary programming proficiency and coding experience. The need then arises to make Python-based computational methods accessible. We present case studies of how to do this via various Python modules being taught at the College of Business at Technological University Dublin, and by means of a walkthrough of an interdisciplinary social good project called InEire. It comes down to complementing existing quantitative and qualitative methods with methods based on analysis of various types of data specific to the social science problem being solved. We essentially go through the process of building curiosity-driven exploration in social science students via a theoretically driven research question rather than the Python technique itself, then focusing on the various steps involved in solving that question, and finally boiling it down to a concrete Python-based data analytics methodology. This project-based teaching methodology helped us develop Python skills in newbies, eventually leading to Python-based data analytics skills in students of disciplines other than Computer Science.
TalkEducation, Teaching & Further Training
Closing Session
Closing Session
PySnooper: Never use print for debugging again
I had an idea for a debugging solution for Python that doesn't require complicated configuration like PyCharm. I released PySnooper as a cute little open-source project that does that, and to my surprise, it became a huge hit overnight, hitting the top of Hacker News, r/python and GitHub trending. In this talk I'll go into:
- How PySnooper can help you debug your code.
- How you can write your own debugging / code intelligence tools.
- How to make your open-source project go viral.
- How to use PuDB, another debugging solution, to find bugs in your code.
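To give a flavor of the second point, here is a minimal sketch of a line tracer built on the standard library's sys.settrace. It is not PySnooper's implementation; the trace_lines decorator and the total function are hypothetical names chosen for illustration.

```python
import functools
import sys


def trace_lines(func):
    """Record (line number, local variables) for each line func executes."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        events = []

        def local_tracer(frame, event, arg):
            if event == "line":
                # Copy the locals so later changes do not alter the record.
                events.append((frame.f_lineno, dict(frame.f_locals)))
            return local_tracer

        def global_tracer(frame, event, arg):
            # Only trace frames that execute the decorated function's code.
            if event == "call" and frame.f_code is func.__code__:
                return local_tracer
            return None

        previous = sys.gettrace()
        sys.settrace(global_tracer)
        try:
            result = func(*args, **kwargs)
        finally:
            sys.settrace(previous)  # always restore the previous tracer
        wrapper.events = events
        return result

    return wrapper


@trace_lines
def total(numbers):
    result = 0
    for n in numbers:
        result += n
    return result


print(total([1, 2, 3]))  # 6
for line_number, local_vars in total.events:
    print(line_number, local_vars)
```

Each recorded event pairs a source line number with a snapshot of the local variables just before that line runs, which is roughly the information a print-free debugging aid needs.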
TalkSoftware Engineering & Architecture
Memory Problems, Did Collector Forgot to Clean the Garbage?
Memory problems are the worst nightmare of every developer whose code serves large files in a production environment. If you have ever faced memory leaks in an application, or if frequent unexpected out-of-memory errors are raising your anxiety levels, then this talk is for you. This talk aims to summarize the common memory issues in Python. It is overwhelming to see them even when the logic in the code is properly optimized. It is even scarier that some of these errors are hard to find and harder to fix.
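One memory issue that fits the title is the reference cycle, which reference counting alone cannot free. The sketch below assumes CPython and uses a hypothetical Node class; it is an illustration, not material from the talk.

```python
import gc
import weakref


class Node:
    def __init__(self, name):
        self.name = name
        self.partner = None


def make_cycle():
    a, b = Node("a"), Node("b")
    a.partner, b.partner = b, a  # a -> b -> a: a reference cycle
    return weakref.ref(a)        # observe `a` without keeping it alive


gc.disable()  # switch off automatic collection so the demo is deterministic
try:
    probe = make_cycle()
    alive_before = probe() is not None  # the cycle still holds both nodes
    gc.collect()                        # the cycle detector reclaims them
    alive_after = probe() is not None
finally:
    gc.enable()

print(alive_before, alive_after)  # True False (on CPython)
```

Until the cycle detector runs, both nodes stay in memory even though no name in the program refers to them any more.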
Talk(c)Python Internals
Demystifying Python's Internals: Diving into CPython by implementing a pipe operator
Diving into the CPython source code can feel daunting. Whether you want to start contributing or just want to get a better understanding of Python by exploring its source code, it's often difficult to know where to start or what you're missing.
In my talk, I will show you around the CPython source code by implementing a new operator, a pipe operator. While doing so, I will discuss core parts of the internals, such as Python's grammar, its syntax trees, and the underlying logic that will perform the operation. By the end, you will have a good idea of the moving parts involved in core language features.
I will also take you through the steps necessary to make it all work. I'll show you how I obtained a copy of the source code, regenerated the parser and token files, and how I compiled my modified version of CPython. I will also write and run tests to help me implement my changes. This should give you a mental framework that helps you while diving into more comprehensive resources, like the excellent Python Developer's Guide.
My talk is aimed at everyone who wants to explore CPython's internals. You don't have to be an expert in Python, although some affinity with Python helps with understanding the internals. I will also use C to implement some of the operator logic, but knowledge of C is by no means required. In short, if you're interested in diving into the CPython source code, this talk is for you.
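Without rebuilding CPython, the standard ast module already lets you look at the syntax tree the parser produces for an existing operator. The sketch below inspects the built-in | operator; the names data and transform are placeholders, and the operator implemented in the talk is assumed to need its own token and grammar rule, which this example does not cover.

```python
import ast

# Parse a single expression that uses the existing binary "|" operator.
tree = ast.parse("data | transform", mode="eval")
node = tree.body

print(type(node).__name__)          # BinOp
print(type(node.op).__name__)       # BitOr
print(node.left.id, node.right.id)  # data transform
```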
Talk(c)Python Internals
An Introduction to Apache TVM
Apache TVM is an open source machine learning compiler framework for CPUs, GPUs, and machine learning accelerators. It aims to enable machine learning engineers to optimize and run computations efficiently on any hardware backend.
This talk will present an introduction to Apache TVM using its Python API, demonstrated with examples of deep learning models being executed on CPUs and microcontrollers.
TalkPyData: Deep Learning, NLP, CV
Why is it slow? Strategies for solving performance problems
You have a performance problem, and you don't know what to do. All you know is that one of your endpoints is very slow; and perhaps it only affects a certain user. How do you figure out why it's slow, and what can you do to catch performance problems before they hurt users in production? This talk will step through several scenarios involving typical performance problems and how to diagnose them.
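One common first step in diagnosing such a problem is to profile the slow code path. The sketch below uses the standard library's cProfile and pstats; slow_lookup and fast_lookup are hypothetical stand-ins for a slow endpoint and its fix, not scenarios from the talk.

```python
import cProfile
import io
import pstats


def slow_lookup(items, targets):
    # List membership scans the list: O(len(items)) work per target.
    return [t for t in targets if t in items]


def fast_lookup(items, targets):
    # Same result, but set membership is O(1) on average.
    item_set = set(items)
    return [t for t in targets if t in item_set]


def profile(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


items = list(range(2_000))
targets = list(range(0, 4_000, 2))

slow_result, slow_report = profile(slow_lookup, items, targets)
fast_result, fast_report = profile(fast_lookup, items, targets)
print(slow_report)
```

Comparing the two reports shows where the time goes before any code is changed, which is usually safer than guessing.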
TalkSoftware Engineering & Architecture
Creating great user interfaces on Jupyter Notebooks with ipywidgets
Jupyter notebooks are great for quickly trying new ideas and experiments, but the downside is that using code to change inputs and see the results can be inefficient and error-prone. ipywidgets is a Python library that solves this problem by providing a user-friendly interface with interactive widgets. It's all in Python, so we don't have to worry about any CSS or JavaScript. In this talk we'll learn how ipywidgets can help us build tools in the context of Data Science.
TalkPyData: Software Packages & Jupyter
Writing secure code in Python
The talk will analyze a series of vulnerabilities that, given some common mistakes, might end up damaging your Python programs (with lots of examples!). At the end, a precaution and audit method will be presented.
TalkSecurity
Lessons learnt from building my own library
One of the many strengths of Python is PyPI, which complements and enhances the "batteries included" approach of the standard library. Building a library and publishing it to PyPI comes with a number of challenges, pitfalls, and choices that someone has to make. In this talk, I will share my journey from v0.1.0 to v1.0.0 and all the moments when I said: "I wish I knew this thing before".
TalkPython Libraries
The intricate art of making your (internal) clients happy - the story from a Python-centered Infra team
If you have ever worked on an internal company project, you may feel it deep in your bones. Let's say that you discovered a need for a generic technological component in your organization's tech stack. You identified stakeholders, gathered requirements, and started agile iterations on providing it. Then comes a day when you can show the MVP to your internal client! Yet... the client has lost his interest: maybe right now he says that he has already come up with his own temporary solution and he has no intention to switch to another one? Building internal products differs from commercial ones - there is no flow of cash and your clients are fully transparent.
In this talk, I would like to share with you my experience and tips connected with developing such internal tools and standards. All of this from the perspective of a member of the Machine Learning Infra team that is delivering its solutions to a rapidly growing ML department in a company whose product is used by 300 million unique users per month.
But let's be specific! I will talk about:
- Common pitfalls when developing internal solutions, and the reasons why they happen
- How one can approach delivering tools (spoiler: pilot programs, guilds, and more!)
- Learnings from introducing such approaches (what worked, what didn't) in our case
TalkSoftware Engineering & Architecture
Morning Announcement
Opening Session
Saving Lives with Predictive Geo - AI
Leveraging geospatial Python libraries to understand and predict high-risk houses during cyclone-induced floods in urban areas, using historical, openly available satellite images and urban morphological data.
Assigning a flood risk score to each individual house near the coastal regions is a challenge. Also, as the land characteristics vary based on different geographical locations, prepare for emergencies on demand.
TalkPyData: Machine Learning, Stats
The Design of Everyday APIs
What makes a good API for a library? Or more importantly, what makes it bad? This talk will discuss the principles of what goes into user-centered design, and how best to apply those principles when writing a Python library for fellow developers.
TalkPython Libraries
Use animated charts to present & share your findings with ipyvizzu
Sharing and explaining the results of your analysis can be a lot easier and more fun when you can create an animated story of the charts containing your insights. ipyvizzu, a new open-source charting tool for Jupyter and Databricks notebooks and similar platforms, enables just that with a simple Python interface. In this talk, the creator of ipyvizzu shows how their technology works and provides examples of the advantages of using animation for storytelling with data. Links:
- ipyvizzu - animated charts within the notebooks
- ipyvizzu-story - this adds the presentation functionality
- slides to download
TalkPyData: Software Packages & Jupyter
When to refactor your code into generators and how
Have you ever found yourself coding variations of a loop construct where fragments of the loop code were exactly the same between the variations? Or, in an attempt to factor out these common parts, you ended up with a loop construct containing a lot of conditional code for varying start, stop, or selection criteria?
You might have felt that the end result just didn't look right. Because of the duplicated parts in your code, you noticed that the code didn't conform to the DRY (Don't Repeat Yourself) principle. Or, after an attempt to combine the variations into a single loop, with consequently a lot of conditional code, your inner voice told you that the resulting code had become too complex and difficult to maintain.
This talk will show you a way out of this situation. It demonstrates how you can create a generator function that implements only the common parts of your loop construct. Subsequently you will learn how you can combine this generator function with distinct hand-crafted functions or building blocks from the standard library itertools module or the more-itertools package.
As an example, imagine you need to implement some varying functionality based on the Fibonacci sequence. This talk shows you what it would look like before and after you've refactored it into a pipeline of generators.
After having seen this pattern, you will recognize more quickly when this kind of refactoring helps you to create more maintainable and more Pythonic code.
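A minimal sketch of the pattern, assuming the Fibonacci example mentioned in the abstract (the exact code shown in the talk may differ): the generator holds only the common loop logic, while the start, stop, and selection criteria come from itertools building blocks.

```python
from itertools import islice, takewhile


def fibonacci():
    """Yield the Fibonacci sequence forever: 0, 1, 1, 2, 3, 5, ..."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# Variation 1: stop after a fixed number of items.
first_ten = list(islice(fibonacci(), 10))

# Variation 2: stop once a value limit is reached.
below_100 = list(takewhile(lambda n: n < 100, fibonacci()))

# Variation 3: add a selection criterion on top of the same pipeline.
even_below_100 = [n for n in takewhile(lambda n: n < 100, fibonacci()) if n % 2 == 0]

print(first_ten)       # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
print(below_100)       # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
print(even_below_100)  # [0, 2, 8, 34]
```

None of the three variations repeats the sequence logic, and none needs conditional code inside the loop.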
TalkSoftware Engineering & Architecture
Packaging security with Nix
Managing dependencies securely is a growing concern in the industry. Here, we showcase how Nix, a functional package manager, can get us very far and close off classes of vulnerabilities that PyPI / pip have had in the past, e.g. rogue PyPI packages that steal personal data.
TalkSecurity
Czech Drought Monitoring system - a journey from manual work to global drought monitoring and machine learning, powered by Python
This talk aims to encourage beginner developers not to underestimate the skills and benefits they can bring to various non-IT environments. I joined a team focused on drought research at the Czech Academy of Sciences in 2016 with a fresh degree in Geoinformatics and minimal experience with coding. Thanks to this very little initial knowledge, we were able to build a robust system providing drought monitoring and forecasts for Czechia and also the whole of central Europe. We were able to fight through text files, user inputs, and geodata of all sorts and say goodbye to manual processing thanks to Python and its geospatial and data processing libraries. On the technical side, the presentation should introduce some of the handy geospatial and data processing tools to get your hands on any task, from producing colorful maps to analyzing time trends in satellite imagery. It should also be a guide on identifying needs and building the most necessary data manipulation processes from scratch.
TalkCareer, Life,...
Write Docs Devs Love: Ten Tricks To Level Up Your Tech Writing
Tutorials, blog posts, and product docs help developers learn. From our favorite tutorials to bad product docs, we all consume technical writing. But what makes for good technical writing? In this talk I'll share 10 tips and tricks to improve your technical writing skills to help your readers succeed.
TalkCommunity & Diversity
What transitioning from male to female taught me about leadership
Not many leaders transition in their mid-thirties, but I did, and it gave me a unique perspective on courage, humility, diversity and inclusion in the context of leadership. In this talk I will tell the story of my transition and along the way you will learn how you can become a better leader.
TalkCareer, Life,...
Simple data validation and setting management with Pydantic
When processing data, validating its structure and its type is critical. Bad record types or changes in structure can often result in processing errors or, worse, in wrong data output. Yet solving this problem cleanly and efficiently can be challenging. It often results in complicated code logic and increased complexity, and consequently in less readable code. Pydantic is an efficient and elegant answer to these challenges.
We expect you'll leave this talk with a good understanding of:
- Existing challenges in data validation
- What Pydantic Models, Validators, and Convertors are
- How to leverage Pydantic in your day to day (using real-life examples)
- [Bonus] How to use Code Generation to create Pydantic Models from any data source
TalkPyData: Data Engineering
Dance with shadows: stubs, patch and mock
To ensure quality, automated testing is a must. But sometimes it is impossible or very expensive to use real environments. In this case, you can isolate some parts of a system and use fake, simulated objects.
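As a minimal sketch of the idea, the example below uses the standard library's unittest.mock. WeatherClient and describe_weather are hypothetical names standing in for an expensive real dependency and the code under test.

```python
from unittest import mock


class WeatherClient:
    def fetch_temperature(self, city):
        raise RuntimeError("would call a real network service")


def describe_weather(client, city):
    temperature = client.fetch_temperature(city)
    label = "warm" if temperature >= 20 else "cold"
    return f"{city}: {label}"


# Stub/mock: replace the collaborator with an object that gives a canned answer.
fake_client = mock.Mock(spec=WeatherClient)
fake_client.fetch_temperature.return_value = 25
stub_result = describe_weather(fake_client, "Dublin")
fake_client.fetch_temperature.assert_called_once_with("Dublin")

# Patch: temporarily replace the method on the real class.
with mock.patch.object(WeatherClient, "fetch_temperature", return_value=5):
    patch_result = describe_weather(WeatherClient(), "Dublin")

print(stub_result)   # Dublin: warm
print(patch_result)  # Dublin: cold
```

Passing spec=WeatherClient makes the fake reject attributes the real class does not have, which catches typos in tests.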
TalkTesting
How to setup your development workflow to keep your code clean
Clean code is something every developer should aim for, but how do you make sure code is actually clean? How much should be invested in that endeavor? Whose responsibility is it?
In this workshop, we will go through all the aspects and stages of setting up your development workflow to help you take ownership of the quality of your code.
We will take a simple application as a starting point and simulate a full development cycle, including coding in the IDE and opening a pull request on GitHub. We will create a CI pipeline triggering code quality monitoring using Sonar tools. More specifically, we will be using SonarCloud as a central platform to monitor code quality and SonarLint to detect issues directly in the IDE.
At the end of the workshop, you will be ready to enable such integration for your own projects.
TutorialSponsor
Secure Python ML: The Major Security Flaws in the ML Lifecycle (and how to avoid them)
Every phase across the end-to-end machine learning lifecycle exposes a plethora of security risks that often go unnoticed by machine learning practitioners. In this talk we uncover the most critical (and common) security risks in the machine learning lifecycle, covering in-depth concepts as well as practical examples of ways in which these can be exploited as well as resolved and mitigated (analogous to the OWASP Top 10 industry standard).
Throughout the talk we will be using a hands on example, where we will be training, packaging and deploying a model from scratch, outlining key risk areas for each step together with tools and practices that can be used to mitigate these risks. By the end of this talk, machine learning practitioners will have a robust intuition of the importance of security best practices throughout the machine learning lifecycle, together with tools and frameworks that can help mitigate undesirable outcomes due to security flaws.
TalkSecurity
Protocols - Static duck typing for decoupled code
Python introduces Protocols to support static duck typing, where static type checkers (such as mypy) and other tools can verify code correctness prior to runtime.
This was added in order to avoid explicitly inheriting from ABCs (abstract base classes), which is "unpythonic and unlike what one would normally do in idiomatic dynamically typed Python code", according to PEP 544.
We will explore the different use cases for Protocols and how to use them correctly.
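A minimal sketch of a Protocol in use, assuming Python 3.8+ (the Resource class and close_all function are hypothetical names for illustration): the class satisfies the protocol purely by having the right method, with no inheritance involved.

```python
from typing import Iterable, Protocol, runtime_checkable


@runtime_checkable
class SupportsClose(Protocol):
    def close(self) -> None: ...


class Resource:
    # Note: Resource does not inherit from SupportsClose.
    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


def close_all(items: Iterable[SupportsClose]) -> None:
    # A static type checker accepts any object that has a matching close().
    for item in items:
        item.close()


resource = Resource()
close_all([resource])
print(resource.closed)                      # True
print(isinstance(resource, SupportsClose))  # True
```

The runtime_checkable decorator is optional; it only enables the isinstance check, which tests for the presence of the method and not its signature.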
TalkPython Libraries
When gRPC met Python
What if we had a tool that helped us do intelligent load balancing? What if we could do selective compression of data and extremely fast, lightweight data transfer? Let me introduce gRPC, the technology that helps us do all of this, and show how we can integrate gRPC with Python.
TalkWeb
Automate cleaning code in few easy steps!
How annoying is it to find out that everything went to hell on the pipeline because you forgot to run the formatters?
Don't waste precious time and learn how it is possible to automate these little things, but most importantly understand why it is important to have them in your code!
TalkPython Friends
TDD in Python with pytest
This workshop will guide you step-by-step through the implementation of a very simple Python library following a strict TDD workflow. At the end of the workshop you will have grasped the main principles of TDD and learned the fundamentals of the Python testing library pytest.
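A minimal sketch of one red-green step in that style follows. The slugify function is a hypothetical example and not the library built in the workshop; the test functions use plain assert statements, which pytest would collect and run.

```python
import re


# Step 1 (red): write the tests first; they fail until slugify exists.
def test_slugify_lowercases_and_joins_words():
    assert slugify("Hello World") == "hello-world"


def test_slugify_drops_punctuation():
    assert slugify("TDD, in Python!") == "tdd-in-python"


# Step 2 (green): the simplest implementation that makes the tests pass.
def slugify(text):
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words)
```

Saved as a file such as test_slugify.py (an assumed name), this would be run with the pytest command; a third step would refactor the implementation while keeping the tests green.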
TutorialTesting
Predicting urban heat islands in Calgary
This talk explains how geospatial Python libraries can help us understand and predict Land Surface Temperature in urban areas using historical openly available satellite images and urban morphological data. This makes data science a powerful tool to plan and design urban areas while reducing the impact of urban warming.
TalkPyData: Deep Learning, NLP, CV
How a popular MMORPG made me a better developer
Have you heard of the critically acclaimed MMORPG Final Fantasy XIV?
As an active player since 2015, I've used my "problem-solving programmer brain" to analyze my experiences in the world of Eorzea and turn them into important software lessons. From finding solutions to a housing crisis, to tracking cheaters, to networking with the president of Square Enix and applying the principles of (Y)MINASWAN, there's a lot to be learned through triumphs and failures as an MMO gamer. I will also talk about my experiences in the software community as a neurodivergent developer, and how gaming helped me break down barriers.
TalkCommunity & Diversity
Developers Documentation: your secret weapon
You can have the best product in your expertise area, but if your documentation isn't on par with the flawless experience you want to offer to the world, success is not guaranteed. Let's be real here: documentation is often an afterthought and rarely included in life cycle development processes. Still, documentation is the secret weapon for greater adoption, and growth that you may have not known you could achieve.
It's time for you to step up your game and measure up to the big players. Learn about the benefits of high quality and educational documentation and the true role it plays in the developer community. You'll also learn the principles of a solid foundation, and tips on how to use one of the most powerful developer relations' tools.
TalkEducation, Teaching & Further Training
Multithreaded Python without the GIL
CPython's "Global Interpreter Lock", or "GIL", prevents multiple threads from executing Python code in parallel. The GIL was added to Python in 1992 together with the original support for threads in order to protect access to the interpreter's shared state.
Python supports a number of ways to enable parallelism within the constraints of the GIL, but they come with significant limitations. Imagine if you could avoid the startup time of joblib workers, the multiprocess instability of PyTorch's DataLoaders, and the overhead of pickling data for inter-process communication.
The "nogil" project aims to remove the GIL from CPython to make multithreaded Python programs more efficient, while maintaining backward compatibility and single-threaded performance. It exists as a fork, but the eventual goal is to contribute these changes upstream.
This talk will cover the changes to Python to let it run efficiently without the GIL and what these changes mean for Python programmers and extension authors.
KeynoteKeynotes
Synergize AI and Domain Expertise - Explainability Check with Python
The talk focuses on establishing guidelines for Explainable AI by diving into fundamental concepts and checkpoints, before accepting AI models to make decisions. We go through explainers, types, and algorithms with a simple implementation in Python, to strengthen our understanding of "WHY?" the model predicts a certain value and "HOW?" to validate it with experiential learning of experts to bridge potential gaps
TalkPyData: Ethics in AI
Mercury - Build & Share Data Apps from Jupyter Notebook
Have you ever wished to magically transform your notebook into a web app and share it with non-coders? Mercury is a new open-source framework for converting a Jupyter Notebook into a web app.
TalkPyData: Software Packages & Jupyter
Making Python better one error message at a time
Error reporting is an area of the Python interpreter that sadly has not improved much recently, and users have been battling with very obscure runtime errors and puzzling syntax error messages that range from very generic (just "syntax error: invalid syntax") to directly misleading (the error displayed for unclosed parentheses). This situation has frustrated users for a long time and has forced everyone into learning "what the interpreter really wants to say" or "where the error really could be". This problem is especially acute for first-time learners of the language, as they can lose a lot of time trying to decipher what the error messages they just got mean and where the problem may be.
Talk(c)Python Internals
AI for Content Moderation at PayPal
Online platforms have a hard time combating hate, hate speech, explicit content and other NSFW material. Most of the solutions are rule based keyword approaches which are brittle and can be bypassed easily. At PayPal, we have a wide range of user generated content and there is a great need to automatically identify and flag hate, explicit and other typologies, to improve user experience and adhere to regulatory policies. In this talk we showcase how AI can help us identify such content with great precision.
TalkPyData: Machine Learning, Stats
Wednesday's Lightning Talks
A lightning talk (LT) is a short presentation that must not be longer than five minutes.
To sign up for a lightning talk, you can put your name on the information board during the conference before the second coffee break. For our online participants, we will set up a separate form or Google sheet for you to put your name and topic in - similar to how we run this at the in-person conference.
We will announce the same every day both online and in person.
Lightning Talk~None of the above
Building a Just-in-Time Python FaaS Platform with Unikraft
Function-as-a-Service (FaaS) platforms are one of the key service offerings for any cloud provider. To provide strong isolation, the functions are run inside heavy-weight virtual machines (and within containers inside those for orchestration reasons). Consequently, such instances take too long to boot and so are kept on all the time, even though the functions only receive requests intermittently. The end result is that current FaaS platforms are much less efficient than they could be.
We will introduce a radically novel way to build FaaS platforms based on Python and the Unikraft Linux Foundation open source project (www.unikraft.org). Unikraft is a toolkit for building fully specialized, cloud-ready virtual machines called unikernels, each targeting a single application. Using Unikraft we can construct extremely specialized, Python-based unikernels that need only a few MBs to run and boot in tens of milliseconds, allowing us to bring VMs up as a request to a function comes in, and to shut them down (or suspend them) afterwards. The result: a Python-based FaaS platform that is significantly more efficient and cheaper to operate than existing offerings.
In the talk we will provide an introduction to Unikraft, how Python is built on top of it, a full description of the FaaS platform and a short demo.
TalkInfrastructure: Cloud & Hardware
How To Train Your Graphics Card (To Read)
This tutorial aims to introduce new users to modern NLP using the open-source HuggingFace Transformers library. We'll use massive, pre-existing language models that you might have heard of like BERT and GPT to solve real-world tasks. By the end of this tutorial you'll be able to classify documents, get your computer to read a text and answer questions about it, and even translate between languages!
TutorialPyData: Deep Learning, NLP, CV
Picking What to Watch Next - build a recommendation system
Recommendation algorithms are the driving force of many businesses: e-commerce, personalized advertisement, on-demand entertainment. Computer algorithms know what you like and present you with things that are customized for you. Here we will explore how to do that by building a system ourselves.
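To make the idea concrete, here is a minimal sketch of user-based collaborative filtering with cosine similarity in plain Python. The ratings data, user names, and titles are invented for illustration, and the tutorial itself may use a different algorithm or library.

```python
from math import sqrt

# Hypothetical ratings: user -> {title: rating from 1 to 5}
ratings = {
    "ana": {"Alien": 5, "Brazil": 4, "Clue": 1},
    "ben": {"Alien": 4, "Brazil": 5, "Dune": 5},
    "chloe": {"Clue": 5, "Emma": 4},
}


def cosine_similarity(a, b):
    """Cosine similarity of two rating dicts; unrated titles count as 0."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[title] * b[title] for title in shared)
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, top_n=3):
    """Rank unseen titles by similarity-weighted ratings of other users."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        weight = cosine_similarity(ratings[user], other_ratings)
        for title, rating in other_ratings.items():
            if title not in ratings[user]:
                scores[title] = scores.get(title, 0.0) + weight * rating
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_n]


print(recommend("ana", ratings))  # ['Dune', 'Emma']
```

Dune ranks first for ana because ben, whose ratings are most similar to hers, rated it highly.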
TutorialPyData: Machine Learning, Stats
Managing the code quality of your project. Leave the past behind: Focus on new code.
If you try to use Pylint or Flake8 on a legacy project, the results are usually truly overwhelming. There might be thousands of warnings, hundreds of errors and maybe even no unit tests. The usual emotional response to this is distress, exasperation... even despair. And then the question comes: Where do I start?
During this talk we will see why it's better to set old code aside and focus first on the new code you're writing. We'll show some possible approaches and tools that can help you keep the focus and deliver new code with a high level of quality.
TalkDevOps
Build a production ready GraphQL API using Python
This workshop will teach you how to create a production ready GraphQL API using Python and Strawberry. We will be using Django as our framework of choice, but most of the concepts will be applicable to other frameworks too.
We'll learn how GraphQL works under the hood, and how we can leverage type hints to create end to end type safe GraphQL queries.
We'll also learn how to authenticate users when using GraphQL and how to make sure our APIs are performant.
If we have enough time we'll take a look at doing realtime APIs using GraphQL subscriptions and how to use GraphQL with frontend frameworks such as React.
TutorialWeb
Python & Visual Studio Code - Revolutionizing the way you do data science
Visual Studio Code along with GitHub, Codespaces, and Azure Machine Learning have been investing substantially into tools and platforms to make the lives of Python data scientists easier, and we want to share why VS Code is now the #1 tool for Python Data Scientists according to the 2021 Python Software Foundation Developer Survey, and how you can leverage VS Code to take your data science productivity to the next level.
This talk will walk through several common Python data science scenarios, showcasing all the productive tooling VS Code has to offer along the way. As a sneak peek, we will be demoing a best in class Jupyter Notebooks experience with VS Code Notebooks, a revolutionary new data cleaning / preparation experience with Data Wrangler in VS Code, collaboration features with GitHub and Codespaces, Azure Machine Learning for deployment, and more!
Sponsored TalkSponsor
Classifying LEGO Bricks with Machine Learning
There are over 70 000 different Lego bricks and they appear in almost 200 different colors. Even the most hardcore AFOLs (Adult Fans of Lego) don't know all of them, let alone recognize them. So I got curious whether it's possible to create an application that can recognize a particular brick using only its photo.
TalkPyData: Deep Learning, NLP, CV
Automate the Boring Stuff with Slackbot(ver.2)
Today, there are many repetitive tasks in a company or community. In addition, we often use chat tools such as Slack for daily communication. So, I created a chatbot (PyCon JP Bot) to automate various boring tasks related to holding PyCon JP.
In this talk, I will first explain how to create a chatbot using Bolt for Python. I will tell you how to register a bot integration on Slack and how to create a simple bot in Python that responds to specific keywords.
TalkPython Libraries
CPython bugs & risky features
In this talk we will look into a few bug cases or doubtful features in CPython some of which are still present (and known to bugs.python.org) and may impose a security risk for admins or organizations.
Talk~None of the above
Registration @ Ground Foyer
Please show up on time for registration and bring your E-ticket along with the order code in order for a speedy registration. We'll require you to present a form of ID document (ex: passport) to verify your identity.
Don't be late :)
Registration
PyArrow and the future of data analytics
In this talk we will introduce PyArrow and talk about the transformation that the Arrow format is enabling in the data analytics world.
PyArrow provides an in-memory format, a disk format, a network exchange protocol, a dataframe library and a query engine, all integrated in a single library. But the Arrow ecosystem doesn't stop there: it lets you integrate multiple different technologies. It can be a Swiss Army knife for data engineers, and in many cases it integrates with NumPy and Pandas at zero cost.
TalkPyData: Data Engineering
Maps with Django
Keeping in mind the Pythonic principle that "simple is better than complex" we'll see how to create a web map with the Python based web framework Django using its GeoDjango module, storing geographic data in your local database on which to run geospatial queries.
TalkDjango
Sponsor Recruitment Session
Many of our sponsors are looking to hire talented people and EuroPython is the perfect place to reach out to them!
In this session, our sponsors will each give a short presentation about their company and what they do with Python. You can then approach them directly at their booth to discuss more details.
TalkSponsor
How to embed a Python interpreter in an iOS app
Come see how you can make a native mobile app that embeds Python 3.10 to allow users to script app behavior. It's allowed by Apple but is currently underutilized by the app makers. Add superpowers to your iPhone app with Python!
Native mobile applications have many advantages over mobile websites or apps made with cross-platform toolkits. They will use less battery, allow for richer graphics, more consistent UI behavior, and enable more functionality through device-specific APIs. Wouldn't it be great to have access to all this from Python?
In this talk, we'll marry a native iOS app written in Swift with an embedded Python 3.10 interpreter to allow users to customize what the application is doing. We'll go through the entire process of:
- embedding Python from source;
- building it into the Swift mobile app in Xcode;
- adding a few pre-compiled third-party libraries like numpy and Pillow to broaden the scope of what the user can do;
- running the resulting app on an iPhone 13;
- modifying the app behavior at runtime thanks to our new Python superpowers!
Knowledge of Swift is not required for attendees of this talk. However, it will be needed later if you want to embed Python in an iPhone app. Embedding Python doesn't really let you make an app without knowing Swift. Don't fret though! It's pretty easy to get the hang of Swift when you're fluent in Python.
TalkSoftware Engineering & Architecture
From pip to poetry - Python (many) ways of packaging and publishing
Ever had issues managing your Python packages and environments? Do you know how to create a package and share it with the community? It can be challenging if you've never done it, but it doesn't have to be hard. There is always a better tool to fit our needs.
In this presentation, I'd like to discuss how Python's package managers appeared and evolved over time, covering pip, pipenv, and poetry and presenting the strong and weak points of each. I also intend to show how to package and publish simple code with each of them, and to suggest which package manager you should choose, whether you are just starting with Python or feel that something has been bothering you and never knew you could solve it easily and painlessly.
Slides can be accessed here: https://github.com/vinigfer/europython_2022_slides
TalkPython Libraries
Try Something Different: Explore MicroPython! (a rough guide for newcomers)
MicroPython - a reimplementation of Python for microcontrollers - is nine years old. How can you find your way in a jungle of tiny chips, circuits, and jumper wires? In this session, we will run through a brief introduction to the world of MicroPython. Beyond the basics, we will explore the projects, tools, and the community that helped your intrepid speaker to get started as a newcomer.
TalkMakers
Leading & growing software teams
Software development is a team game.
As you progress through your career, you might end up in a leadership role, taking care of your own team, or even of multiple teams.
As a team lead, it's up to you to establish a good working rhythm, set the right expectations, communicate up and down the chain of command and effectively help your team grow in both technical and non-technical terms.
As a team lead, you want to enable your team to reach its full potential.
The main goal of this talk is to provide pragmatic real-life examples, about how to achieve those things.
TalkCareer, Life,...
Dodging AI Dystopia: you can't save the world alone
If real life was a superhero movie, we'd have all the ingredients needed for a hero's rescue. So many "AI" algorithms are being applied to EU education, employment, and public safety systems that you might wonder if the TV series "Black Mirror" is fiction or a blueprint for nefarious actors. Luckily, there are codes to keep dystopia at bay, whether from the fictional Justice League or from real-life courts of justice. This talk discusses both, and is aimed at software engineers, architects, designers, testers and product/project managers who want to slow the Automation of Everything, but don't know how.
Keynote
Tales of Python Security
Security vulnerabilities receive huge publicity but also significant secrecy. In this session, we will walk through some of the biggest issues of the last few years from the perspective of a member of the Python Security Response Team. You'll learn how we work to protect all CPython users, how you can help, and how you can help protect yourself from malicious attackers.
TalkSecurity
Online voting system used for primary elections for the French Presidential, must be secure right ?
Since its inception, online voting has been an appealing but controversial technology.
Indeed, what seems like a modern way of making voting cheaper and more convenient is often considered by activists and researchers to be a Pandora's box, unleashing never-ending privacy and authenticity concerns.
However, with Covid-19 shrinking our public interactions, many have decided that the benefits outweigh the theoretical issues, and the use of online voting systems has skyrocketed like never before...
The Neovote voting system has been massively used in France: tens of universities, hundreds of private companies and, more importantly, it was chosen to organise 3 of the 5 main primary elections for the French presidential election of 2022 (Primaires de l'Écologie, Les Républicains and Primaire Populaire).
Neovote claims to have the highest possible level of security, with voters even able to access the final ballot box to do the recount themselves and check that their own vote has been taken into account!
So, challenge accepted: this talk will walk you through the Neovote voting system to understand why their claims are "slightly" exaggerated ;-)
Talk~None of the above
How we are making Python 3.11 faster
Python 3.11 is between 10% and 60% faster than Python 3.10, depending on the application. We have achieved this in a fully portable way by making the interpreter adapt to the program being run, and by streamlining key data structures.
In this talk I will explain what changes we have made, and how they improve performance.
Talk(c)Python Internals
Machine Translation engines evaluation framework
As engineers in an ML R&D department of a large healthcare enterprise, we were presented with the task of evaluating several Machine Translation engines and choosing the one best suited to our corporate needs. To do that, we created an extendable Python-based framework that allowed us to easily plug in different Machine Translation engines and compare them across a large variety of test datasets with a unified set of quality metrics. Our goal from the start was to create a universal MT evaluation framework that will be useful not only for the healthcare domain, but for a wider community as well.
In this talk we will present our evaluation framework and do a walk-through of its capabilities. We will also cover how it can be extended to new MT engines, new test datasets and new language pairs, and present our evaluation results for several state-of-the-art machine translation engines, both open-source and cloud-based.
All the source code of our framework is published to open source: https://github.com/Optum/nmt
TalkPyData: Deep Learning, NLP, CV
How much time does it take to write tests? A case study
Writing automated tests takes time. As developers, we are constantly pressed by management to deliver early, which means we are tempted to skip writing some of the tests. Of course, in the long term, the time needed to write tests is paid off.
But how much of our time do we spend in order to write tests? Is it half? Is it three-quarters? This can be difficult to measure, particularly if we are using test-driven development, because in that case writing tests is integrated in the process of writing code.
While I like test-driven development, I can only practice it when I have a good idea of what code I want to write. But sometimes my idea of how to approach the problem at hand is quite vague and I experiment a lot. In these cases, I write the code first and the tests after that.
In one such case I first finished the functionality I was developing and proclaimed it "beta". I then went on to write the unit tests for it. As a result, I have a clear idea how much time I spent writing documentation and main code, and how much I spent writing tests. In this talk I examine the implications of all this.
TalkTesting
I have to Confess, I still Love Pandas
Pandas is the first Python library that I learned to use. It is used by data scientists to manage, transform and inspect data. As more and more open-source tools appear, it seems the spotlight has shifted and I would love to shine some light on this tool that all should know.
TalkPyData: Machine Learning, Stats
Real-time browser-ready computer vision apps with Streamlit
By using Streamlit and streamlit-webrtc, we can create web-based real-time computer vision apps with only ~10 to 20 additional lines of Python code.
To turn computer vision models into real-time demos, we have conventionally used OpenCV modules such as cv2.VideoCapture and cv2.imshow(). However, such apps are difficult or impossible to share with friends, run on smartphones, or integrate with modern interactive widgets and other data views and inputs.
Web-based apps don't have such problems.
Streamlit provides an easy way to build web apps quickly, and streamlit-webrtc allows you to use real-time video streams.
You can create real-time video apps with modern interactive views and inputs, and host these apps on the cloud to use from any devices with browsers.
In this talk, I will demonstrate the development process using these libraries and show a variety of examples so that we see how easy and useful they are and can make use of them in daily development and research.
streamlit-webrtc extends Streamlit to be capable of dealing with real-time video and audio streams.
With a combination of these libraries, developers can rapidly create real-time computer vision and audio processing apps for which OpenCV has typically been used.
TalkPyData: Deep Learning, NLP, CV
Music and Code
A playful exploration of the similarities and differences between music and code. What could coders learn from musicians, especially when it comes to learning, training and mentoring? (A personal perspective from someone who has been a professional musician, a professional teacher, and a professional coder.)
TalkEducation, Teaching & Further Training
Creating the Next Generation of Billionaires - Part 4
Our generation of young people in school (aged 5-18) have noticed the connection between Computer pRogramming, Technology, Bitcoinism Success, Climate Change and Billionaires.
En masse, young people are clamouring to master the skill of Computer pRogramming. It has been dubbed the "4th R" (computer pRogramming) along with Reading, wRiting and aRithmetic. So, governments worldwide have launched initiatives to have it taught in schools from kindergarten all the way to high school. And now young people are successfully mastering this skill.
This talk will describe a case study in which Computer Programming (Python) was introduced for the first time to a group of young people, and how they are using it to explore and understand real world problems and data, such as those relating to climate change, world population growth and carbon dioxide emissions, with Python libraries such as Matplotlib, NumPy and Pandas. We will talk about the joys, challenges and discoveries made by the young people. We will conclude with suggestions on how to proceed in this area.
TalkEducation, Teaching & Further Training
Build your own Playlist Recommender System with Python using your GDPR Data
In my talk, we explore our usage data requested according to GDPR and leverage it - together with Spotify's Web API - to build a personalized playlist recommender system with Python.
In 2018, the General Data Protection Regulation (GDPR) became effective in the EU. It sometimes causes data scientists great headaches. But from a consumer and Pythonista point of view this can also be interesting data for exploration. It is very useful for building personalization technology, in particular recommender systems. And there are almost endless ways to use Python for it. So, let's request and use our own data to build a playlist recommender system which infers our music taste from our streaming history and uses it to retrieve songs from our favorites in a new way. We will call it "Your Rediscover Past", a personalized playlist based on your streaming history and saved songs.
TalkPyData: Deep Learning, NLP, CV
Let's talk about JWT
JSON Web Tokens, or JWTs for short, are all over the web. They can be used to track bits of information about a user in a very compact way and can be used in APIs for authorization purposes. Join me and learn what JWTs are, what problems they solve, how you can use JWTs, and how to be safer when using JWTs in your applications.
TalkWeb
[CANCELLED] Build your own linters
Despite a ton of wonderful linters out there, it pays off to scratch your itch and learn how to write one yourself. Anytime a pet peeve starts bothering you in code reviews, you'll have all the tools at your disposal.
TutorialTesting
Common Python Mistakes with Kubernetes, How They Can Cause Vulnerabilities and How to Solve Them!
In this session, we will have a look at common mistakes in Python that can cause serious code vulnerabilities, specifically for Kubernetes deployments of the code. We will subsequently have a look at what those vulnerabilities can actually result in and how your containerized application can get "hacked" as a result. We will also discuss how developer and security teams struggle to talk in a common language to prevent and mitigate these vulnerabilities. Lastly, we will see how you can prevent and mitigate these vulnerabilities in real life.
Talk~None of the above
Python's role in unlocking the secrets of the Universe with the James Webb Space Telescope
The James Webb Space Telescope is a groundbreaking infrared observatory resulting from an international collaboration between NASA, the European Space Agency, and the Canadian Space Agency. It was successfully launched on Christmas Day 2021 from Europe's spaceport in Kourou, French Guiana, and is currently orbiting the L2 point 1.5 million km from Earth.
Webb was designed to address some of the biggest questions in astronomy and astrophysics, including identifying the first stars in the Universe, observing the first galaxies, revealing the initial stages of star and planet formation, and probing the composition of exoplanet atmospheres.
But how do we go from the raw data collected by Webb to science-ready data products delivered to astronomers and astrophysicists around the world? How do we embed our understanding of the telescope and its instruments into this process? How did we prepare and test this?
From instrument simulators to the ambitious Webb Calibration Pipeline, the software suites that support these tasks are written in Python. In this talk I will give an overview of Webb, the crucial role of Python in Webb's development and data processing, and I will show and discuss the first publicly released images from this revolutionary telescope.
KeynoteKeynotes
Morning Announcement
Opening Session
Build with Audio: The easy & hard way!
The audio (& speech) domain is going through a massive shift in terms of end-user performance. It is at the same tipping point as NLP was in 2017 before the Transformers revolution took over. We've gone from needing a copious amount of data to create Spoken Language Understanding systems to just needing a 10-minute snippet.
This tutorial will help you create strong code-first & scientific foundations in dealing with audio data and build real-world applications like Automatic Speech Recognition (ASR), Audio Classification, and Speaker Verification using backbone models like Wav2Vec2.0, HuBERT, etc.
TutorialPyData: Deep Learning, NLP, CV
Data Warehouses Meet Data Lakes
Many organizations have migrated their data warehouses to data lake solutions in recent years. With the convergence of the data warehouse and the data lake, a new data management paradigm has emerged that combines the best of two approaches: the bottom-up approach of big data and the top-down approach of a classic data warehouse.
TalkPyData: Data Engineering
Inspect and try to interpret your scikit-learn machine-learning models
This tutorial is subdivided into three parts.
First, we focus on the family of linear models and present the common pitfalls to be aware of when interpreting the coefficients of such models.
Then, we look at a larger range of models (e.g. gradient-boosting) and put into practice available inspection techniques developed in scikit-learn to inspect such models.
Finally, we present other tools to interpret models that are not currently available in scikit-learn but are widely used in practice.
TutorialPyData: Machine Learning, Stats
A Tale of two Kitchens, hyper modernizing your codebase.
When starting a new Python project, the "hypermodern" Python "template" is a popular choice. Its style is opinionated and strict, and it brings a consistent style and today's best practices. How do I bring my legacy codebase up to this standard?
Sponsored TalkSponsor
Protocols in Python: Why You Need Them
Protocols have been around since Python 3.8. So what are they, and how can they help you write better code? And how are they different from Abstract Base Classes? In this talk I will introduce you to both concepts (ABCs and Protocols), and show you by example how they can make your life easier, and your code cleaner.
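As a rough illustration of the difference the abstract refers to (not the speaker's own example), the sketch below puts an ABC next to a Protocol. All class and function names here are hypothetical: Shape is an ABC that implementers must inherit from, while HasArea is a Protocol that any class with a matching area() method satisfies.

```python
from abc import ABC, abstractmethod
from typing import Iterable, Protocol, runtime_checkable


class Shape(ABC):
    """ABC: classes opt in by inheriting explicitly (nominal typing)."""

    @abstractmethod
    def area(self) -> float: ...


@runtime_checkable
class HasArea(Protocol):
    """Protocol: any class with a matching area() qualifies (structural typing)."""

    def area(self) -> float: ...


class Square(Shape):
    def __init__(self, side: float) -> None:
        self.side = side

    def area(self) -> float:
        return self.side * self.side


class Circle:  # no base class at all
    def __init__(self, radius: float) -> None:
        self.radius = radius

    def area(self) -> float:
        return 3.14159 * self.radius ** 2


def total_area(shapes: Iterable[HasArea]) -> float:
    """Accepts anything that structurally provides area()."""
    return sum(shape.area() for shape in shapes)
```

Here total_area([Square(2.0), Circle(1.0)]) type-checks and runs even though Circle never mentions HasArea, whereas isinstance(Circle(1.0), Shape) is False because Circle does not inherit from the ABC.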
TalkPython Libraries
Thursday's Lightning Talks
A lightning talk (LT) is a short presentation that must not be longer than five minutes.
To sign up for a lightning talk, you can put your name on the information board during the conference before the second coffee break. For our online participants, we will set up a separate form or Google sheet for you to put your name and topic in - similar to how we run this at the in-person conference.
We will announce the same every day both online and in person.
Lightning Talk~None of the above
Debugging asynchronous programs in Python
Recently the interest in asynchronous programming has grown dramatically. Unfortunately, asynchronous programs do not always have reproducible behavior. Even when they are run with the same inputs, their results can be radically different. In this talk I'll show you different approaches on how to debug asynchronous programs in Python.
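One standard-library starting point, shown here as a small sketch rather than as the approach taken in the talk, is to run the event loop in asyncio's debug mode and give tasks readable names so they can be identified while they are in flight. The worker coroutine and the task names are made up for the example.

```python
import asyncio


async def worker(name: str, delay: float) -> str:
    """Stand-in for real work; the delay simulates I/O."""
    await asyncio.sleep(delay)
    return name


async def main() -> dict:
    loop = asyncio.get_running_loop()
    tasks = [
        asyncio.create_task(worker("fast", 0.01), name="fast-worker"),
        asyncio.create_task(worker("slow", 0.05), name="slow-worker"),
    ]
    # Snapshot of what is currently scheduled; named tasks keep it readable.
    in_flight = sorted(
        task.get_name()
        for task in asyncio.all_tasks()
        if task is not asyncio.current_task()
    )
    # gather() returns results in the order the tasks were passed in,
    # regardless of which one finished first.
    results = await asyncio.gather(*tasks)
    return {"debug": loop.get_debug(), "in_flight": in_flight, "results": results}


# debug=True enables extra checks, e.g. logging of slow callbacks
# and of coroutines that were never awaited.
report = asyncio.run(main(), debug=True)
```

The same debug mode can also be switched on without code changes by setting the PYTHONASYNCIODEBUG environment variable.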
TalkSoftware Engineering & Architecture
Game Development with CircuitPython
With a large selection of handheld devices running CircuitPython, it's natural to want to make games for them. But where to start? What are the options available for the hardware, the libraries and other resources? And how do you use all of that? This talk aims to give a gentle introduction for everyone.
TalkMakers
The Geometry of the Universe
A place to come and talk about the geometry of the universe.
Sagittarius A* and where is the Sun?
gamma ray bursts
gravitational waves.
What will James Webb see?
How to test different models?
Python, matplotlib, scipy, astropy
units and constants. Hubble and c
But maybe Hubble's not constant?
PosterPosters
Beyond the Basics: Data Visualization in Python
The human brain excels at finding patterns in visual representations, which is why data visualizations are essential to any analysis. Done right, they bridge the gap between those analyzing the data and those consuming the analysis. However, learning to create impactful, aesthetically-pleasing visualizations can often be challenging. This session will equip you with the skills to make customized visualizations for your data using Python.
While there are many plotting libraries to choose from, the prolific Matplotlib library is always a great place to start. Since various Python data science libraries utilize Matplotlib under the hood, familiarity with Matplotlib itself gives you the flexibility to fine tune the resulting visualizations (e.g., add annotations, animate, etc.). This session will also introduce interactive visualizations using HoloViz, which provides a higher-level plotting API capable of using Matplotlib and Bokeh (a Python library for generating interactive, JavaScript-powered visualizations) under the hood.
TutorialPyData: Software Packages & Jupyter
Norvig's lispy: beautiful and illuminating Python code
Why isn't if a function? Why does Python need to add keywords from time to time? What precisely is a closure, what problem does it solve, and how does it work? These are some of the fundamental questions you'll be able to answer after this tutorial: an interactive exploration of Peter Norvig's lis.py, an interpreter for a subset of the Scheme dialect of Lisp in 132 lines of Python.
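Two of these questions can be previewed in plain Python. The sketch below is not taken from lis.py; the names if_function, noisy and make_counter are invented for illustration. It shows that an ordinary function evaluates all of its arguments before it runs, which is one reason a conditional cannot simply be a function, and that a closure is a function that keeps access to variables from the scope in which it was created.

```python
evaluated = []


def noisy(label: str) -> str:
    """Record that this expression was evaluated, then return its value."""
    evaluated.append(label)
    return label


def if_function(condition, then_value, else_value):
    """A would-be 'if' written as an ordinary function."""
    return then_value if condition else else_value


# Both branches are evaluated before if_function even starts running.
chosen = if_function(True, noisy("then"), noisy("else"))


def make_counter():
    """Return a function that remembers 'count' between calls (a closure)."""
    count = 0

    def increment() -> int:
        nonlocal count  # rebind the variable in the enclosing scope
        count += 1
        return count

    return increment
```

After running this, evaluated contains both "then" and "else" even though only "then" was chosen, and each call to make_counter() yields an independent counter with its own private count.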
TutorialPython Friends
Python Packaging Automation - Auto-Publish to PyPI via Pull Requests
A huge source of friction in software is publishing new releases. Somebody has to manually review commits and write a change-log, add a version number, and publish to PyPI. We will cover a better way: an automated process in which new versions are automatically published by merging pull requests.
TalkDevOps
Work in Progress: Implementing PEP 458 to Secure PyPI downloads
PEP 458 uses cryptographic signing on PyPI to protect Python packages against attackers. In this talk we will share our lessons learned from the ongoing implementation work in PyPI/Warehouse with the Python community. How does PEP 458 work and what is TUF? What protection can it offer now and what does it enable in the future? And how am I affected as a Python developer and as a user?
TalkSecurity
Applications of Python in Computational Chemistry and Material Design
Computational chemistry is the branch of chemistry that studies chemical systems through simulation and involves HPC architecture and software packages. Python has become an integral part of computational modelling of materials in recent years, with the development of packages such as the Atomic Simulation Environment (ASE), which is a set of modules for manipulating, running and visualising atomic simulations. Furthermore, ASE integrates seamlessly with many electronic structure software packages, used for calculating the energy and properties of systems based on some level of theory, e.g. Density Functional Theory (DFT). Moreover, the combination with other Python packages that integrate with ASE provides an ecosystem for atomic simulations. Packages such as CatLearn, a machine-learning approach used for calculating energies needed for reactions, along with Phonopy and FHI-vibes, both used for studying lattice dynamics of materials, to name a few, provide a comprehensive toolkit for the computational study of materials and chemical systems.
In our research, such approaches are essential to further our understanding of materials and chemical processes, and of particular interest are materials for green and sustainable processes, such as catalysts used to produce fossil fuel alternatives. In this regard, as Python software becomes increasingly popular for the simulation and study of materials, it also provides the tools and methods needed for tackling some of the challenges of today.
PosterPosters
Three Musketeers: Sherlock Holmes, Mathematics and Python
Mathematics is a science and one of the most important discoveries of the human race on earth. Math is everywhere around us. It is in nature, music, sports, economics, engineering, and so on. In our daily life, we use mathematics knowingly and unknowingly. Many of us are unaware that forensic experts use mathematics to solve crime mysteries. In this workshop, we will explore how Sherlock Holmes, the famous fictional detective character created by Sir Arthur Conan Doyle, uses mathematics and the Python programming language to solve crime mysteries. In short, the workshop begins with an introduction to forensic mathematics and covers basic principles, thereby setting the stage. Then, we will solve simple crime puzzles using mathematics and simple Python scripts. Finally, we will solve a few complex hypothetical crime mysteries using advanced Python concepts. The participants will learn how to use the concepts of mathematics such as statistics, probability, trigonometry, and graph theory, and Python and its packages such as SciPy, NumPy, and Matplotlib to solve the crime puzzles.
TutorialEducation, Teaching & Further Training
EModelRunner: a Python package to run online available biological neuron model implementations
The Blue Brain Project hosts several online portals from which users can download single neuron model implementations. These contain the information necessary to simulate the electrical behavior of a neuron. EModelRunner is a Python library that provides a unified interface to the users to run these downloaded models. It gives the users the ability to customize the properties of the neurons, apply various stimuli in order to observe the corresponding behavior, or activate the synapses that are present on the morphology of the neuron. This way neuroscientists can investigate the neurons in a self-contained environment and conduct digital experiments on them.
PosterPosters
Taking charge of your race conditions
Are you working with threads, processes, or more generally "workers"? And do you have blocks of code that must not be called concurrently? Maybe you didn't even realise it until your system experienced a bug you could not reproduce until the stars aligned. Then you surely know that hope is not the answer to a robust system; we must be prepared to face worst-case scenarios.
This talk will first briefly present race conditions, a staple in concurrent computing. We will then compare implicit and explicit concurrency management in your core logic, that is whether you delegate or craft protective logic yourself. Next comes testing, the real crux of the talk, where we will demonstrate how to manufacture a race condition. Finally we will explore how to solve such problems with the built-in tools the Python standard library offers.
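As a hedged sketch of what manufacturing a race condition could look like (the talk's own technique may differ), the example below adds a test hook between the read and the write of a counter. A threading.Barrier used as the hook forces two threads to both read before either writes, so the lost update happens on every run instead of once in a blue moon. The Counter class, make_pause and run_workers are hypothetical names.

```python
import threading


class Counter:
    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def unsafe_increment(self, pause=None) -> None:
        current = self.value      # read
        if pause is not None:
            pause()               # test hook: hold the race window open
        self.value = current + 1  # write

    def safe_increment(self, pause=None) -> None:
        # The lock makes read-modify-write a single critical section.
        with self._lock:
            self.unsafe_increment(pause)


def make_pause(parties: int, timeout: float = 0.5):
    """Build a hook that waits until all workers are inside the race window."""
    barrier = threading.Barrier(parties)

    def pause() -> None:
        try:
            barrier.wait(timeout=timeout)
        except threading.BrokenBarrierError:
            # With a lock in place the workers can never meet here, so the
            # wait times out (breaking the barrier) and they run one by one.
            pass

    return pause


def run_workers(increment) -> int:
    """Run two threads through 'increment' and return the final count."""
    counter = Counter()
    pause = make_pause(2)
    threads = [
        threading.Thread(target=increment, args=(counter, pause))
        for _ in range(2)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

With these definitions, run_workers(Counter.unsafe_increment) loses one update, while run_workers(Counter.safe_increment) counts both increments, at the cost of waiting out the barrier timeout once.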
TalkSoftware Engineering & Architecture
Scalpel: The Python Static Analysis Framework
Although Python is the most popular programming language nowadays, it has been pointed out that static analysis of Python code has not yet received enough attention from the research and OSS communities. For instance, to the best of our knowledge, no general static analysis framework has been proposed to facilitate the implementation of dedicated Python static analyzers (compare, e.g., the Soot and WALA frameworks for Java).
The very qualities that make Python stand out, ease of use and fast prototyping, also make static analysis challenging. To fill this gap, we designed and implemented Scalpel (a Python static analysis framework) and made it publicly available as an open-source project. The Scalpel framework already integrates a number of fundamental static analysis functions (e.g., call graph construction, control-flow graph construction, and alias analysis) that are ready to be reused by developers to implement client applications that statically resolve dedicated Python problems such as detecting bugs or fixing vulnerabilities. In addition, documentation and a user guide are provided for users.
The objective of the Scalpel framework is to (1) improve Python software quality and (2) support addressing research challenges (e.g. API studies) in software engineering research.
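To give a feel for the kind of analysis mentioned above, here is a very small call-graph sketch built only on the standard library ast module. It does not use Scalpel's API, and the SOURCE snippet and the called_names helper are made up for illustration.

```python
import ast

# Hypothetical program to analyse; it is parsed, never executed.
SOURCE = '''
def load(path):
    return open(path).read()

def main():
    data = load("input.txt")
    print(len(data))
'''


def called_names(source):
    """Map each top-level function to the sorted plain names it calls."""
    tree = ast.parse(source)
    calls = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            names = set()
            for child in ast.walk(node):
                # Only direct calls such as f(x); method calls like obj.m() are skipped.
                if isinstance(child, ast.Call) and isinstance(child.func, ast.Name):
                    names.add(child.func.id)
            calls[node.name] = sorted(names)
    return calls


graph = called_names(SOURCE)
```

A real framework also has to resolve methods, imports, and aliases, which is where most of the difficulty of analysing a dynamic language lies.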
TalkSoftware Engineering & Architecture
Build-A-Database with Python
Databases are beautiful beasts built with several layers of abstraction that we rarely have to know or even care about. Now it's time to break them open and discuss the different components that make up a database and how one would go about building one if they wanted to. This talk is based on what I learned while building my own toy database in Python.
TalkSoftware Engineering & Architecture
Applying machine learning capabilities to wearable IoT devices for boxing technique management
IoT devices are increasing in power and capabilities, now allowing developers to deploy machine learning models on the device. This talk will analyse a boxing training session with motion sensors onboard multiple IoT devices using TinyML: a TensorFlow-based framework. Ultimately, these machine-learning powered IoT devices provide feedback to boxers on their technique.
TalkPyData: Machine Learning, Stats
How to craft awesome Machine Learning demos with Python
Building interactive Machine Learning demos is now easier than ever. With Open Source libraries such as Gradio and Streamlit, you can use Python to craft demos, and use Spaces to share them with the rest of the ML ecosystem as well as non-ML people. Learning to create graphical interfaces for models is extremely useful for sharing them with other interested people. All of this leverages free, open-source tools that anyone can use.
TalkPyData: Deep Learning, NLP, CV
My journey using Docker as a development tool
I have been programming in Python for 5 years and almost from day one I've been using Docker with Python. Docker is now a widely used tool across the industry, due to its flexibility. It can be used as a tool to help deploy your code in production, say using Kubernetes. It can also be used as a tool to help develop code locally, with tools such as docker-compose.
It has taken me some time to discover various features and best practices when using Docker, especially when it comes to using it for local development.
In this talk, I would like to go over the journey I have taken with Docker whilst working with it over several years, starting from a single build step with a full-fat image and moving on to multi-stage Docker images. I will show you how you can use the same Dockerfile for development and production.
TalkWeb
Diversity & Inclusion in the Python Community Panel
Come meet some of the folks working on Diversity and Inclusion in the global Python Community! Join the live panel discussion to hear about the challenges and the work they do.
With Marlene Mhangami, Nabanita Roy, Iqbal Abdullah, Tereza Iofciu. Chaired by Naomi Ceder.
PanelCommunity & Diversity
Event-driven microservices with Python and Apache Kafka
Implementing complex systems with microservices can be a great decision, but if we're not careful we can end up with a distributed monolith. Let's see how to avoid that by building lightweight, loosely coupled microservices using Python, Flask, and Apache Kafka.
TalkSoftware Engineering & Architecture
Native Packaging of GUI Apps on Windows and macOS
Distributing Python GUI applications to end users is a challenge: will they need to install Python? If so, which version? If not, how do they install the application? From a random ZIP file? How native does the process feel? Will their system trust your code? For a fluid experience, it needs to be signed and (on macOS) notarized beforehand.
Welcome to pup, the tool that the Mu Editor development team has created to package and distribute it in platform-native formats to Windows and macOS users around the world.
In this session I will show how pup can be used to package GUI Applications for distribution: natively on Windows and macOS, and in early stages of development for distribution-agnostic Linux artifacts. In short, if it's pip-installable it is pup-packageable!
I will then describe the way pup works (and how it differs from comparable tools) leading on to a call-for-action moment, where I'll share its current state of development, what's good, what's bad, and where I'd like it to be headed to.
I'll wrap up the talk with a set of future-looking thoughts that pup has helped identify not only on the specifics of CPython's distribution, but also on the Python ecosystem as a whole.
Talk~None of the above
Unfolding the paper windmills
Research is done on the shoulders of giants. Luckily and unluckily, those giants spoke paper-English and documented their achievements kind of publicly so we could advance the science.
In this talk, we will dissect the structure of a paper, looking for the essential points that will help us understand it and implement it. Following that, we will get our hands dirty and implement the paper using Python. In particular, we will dive into the seminal paper "Attention is all you need" and implement a transformer using JAX.
The key takeaways from this talk are:
- Demystify academic reading.
- Understand the Transformer architecture.
- An introduction to the JAX ecosystem.
TalkPyData: Deep Learning, NLP, CV
Raise better errors with Exception Groups
New in Python 3.11, Exception Groups help you raise and handle errors more robustly than ever before. You will delve deep into the current gaps in Python's exception handling mechanisms and get to know Exception Groups and the new except* syntax, which can be used to overcome those issues and to write cleaner code.
TalkSoftware Engineering & Architecture
Dr. Jekyll & Mr. Hyde - transition from developer to manager without going crazy or becoming evil
In the career of many developers, there comes the point of deciding "what next?". The typical two choices are to stay on the technical path and pursue the way of a software architect, or to take a leap of faith and jump to a people management role. In my talk, I'll show you the pros, cons, and challenges of pursuing the latter.
TalkCareer, Life,...
Bulletproof Python - Property-Based Testing with Hypothesis
Property-based testing is a great benefit to the robustness and maintainability of your software. Yet, the technique is still vastly underused in the Python community. The workshop gives a hands-on introduction to Hypothesis and practices different approaches for writing property-based tests.
TutorialTesting
Is the news media polarized? Or are we being conditioned to think it is?
In this talk, we aim to find out whether polarization is induced in a neural network by feeding it newspaper articles with manufactured sentiments according to the AllSides Media Bias Chart for the level of faith of people on various aisles of the political spectrum. This project consists of a set of experiments on similar datasets from news agencies across the various subsets in the "media-bias" chart. Perceived news media bias is common across consumers that belong to various political affiliations. While anecdotal evidence of this exists, and there are annotated datasets that aim to capture the "spin" a news agency puts on certain events and entities, whether this is a widespread problem and whether it can be detected by a neural network topically or temporally still needs to be explored. The news media bias analysis is modelled as a Natural Language Processing sentiment analysis task and a fake news binary classification task, to deduce the level of polarization in a neural network by feeding it headlines, embedded using pre-trained sentiment models, from news publications across the political spectrum. When it came to fake news vulnerability, news from all kinds of perceived politically affiliated news media held up well against a fake news dataset, with very good accuracy: none of the accuracies dropped below 95%. This is a significant result that sort of debunks the AllSides categorization.
TalkPyData: Deep Learning, NLP, CV
Friday Lightning Talks
A lightning talk (LT) is a short presentation that must not be longer than five minutes.
To sign up for a lightning talk, you can put your name on the information board during the conference before the second coffee break. For our online participants, we will set up a separate form or Google sheet for you to put your name and topic in - similar to how we run this at the in-person conference.
We will announce the same every day both online and in person.
Lightning Talk~None of the above
On the benefits of using workflows: insights from two software tools in the context of computational neuroscience
The Blue Brain Project strives to simulate the whole mouse brain. The amount of data and code this implies is astoundingly high, and it requires the development of software tools that have a strict and clear structure and that are resilient to the errors that will manifest when developing complex code. Workflows are a straightforward way to maintain structure in toolchains that grow increasingly complex. Workflow management packages such as Luigi bring functionality to run different tasks in parallel, keep track of completed tasks and improve reproducibility. This poster will present two Blue Brain Project software tools, the e-model-packages software and BluePyEModel, focusing on the creation and distribution of in-silico neuron cells. The e-model-packages software collects cells from an in-silico brain circuit and arranges them in individual "neuron packages" to be distributed to the public through the Blue Brain online portals. The cell packages it creates are designed to be easily run with the open source EModelRunner package. The BluePyEModel software creates and optimizes in-silico neurons and is able to reproduce features from real neuronal experiment recordings. Under the hood, it uses the open source BluePyEfe and eFEL packages to extract the electrophysiological features from experimental cells, and the open source BluePyOpt simulator to optimize and validate the parameters of the in-silico neurons.
PosterPosters
Education Panel
Teaching Python: a panel discussion with perspectives from teachers, academics, makers and enthusiasts. Why is Python appealing in education? What tools and resources work well? What can the Python community do to help teachers & policy makers? Join us for an engaging and insightful discussion with a fascinating panel of experts, Dr Keith Quille, Kelly Schuster-Paredes, Chris Reina and Sarah-Jayne Carey.
PanelEducation, Teaching & Further Training
A Personal Brand? Surprise, you already have one!
Why should you care about your personal brand? After all, it's not like you are an actor or the lead singer for a rock band. In fact, it's never been more important for you to think about yourself as a brand. Doing so will provide rocket fuel for your career. You'll find better jobs and become a "thought leader" in your industry. You'll become known for your expertise and leadership; people will seek your advice and point of view. As a developer, there are many tools you can use to build a personal brand, and this presentation will help you learn how to get visibility, make a real impact, and achieve your goals. You don't need to be a marketing expert or a personal branding guru; you can be yourself and get your dream job or reach the next level of your career.
PosterPosters
Open Science: Building Models Like We Build Open-Source Software
The use of transfer learning has begun a golden era in applications of Machine Learning but the development of these models "democratically" is still in the dark ages compared to best practices in Software Engineering. I describe how methods of open-source software development can allow models to be built by a distributed community of researchers.
TalkPyData: Ethics in AI
When Models Query Models
The design of large-scale engineering systems, including but not limited to aerospace, particle accelerators, and nuclear power plants, is carried out with a wide range of numerical models such as CAD files, finite-element models, and machine learning surrogate models, to name a few. In order to provide a uniform modelling interface, we encapsulate numerical models in notebooks. A notebook controls model creation, execution, and the querying of results. Numerical solvers are embedded in Docker containers and provide an isolated and reproducible environment exposing a language-agnostic REST API. A model registry enables efficient queries of models. The overall system is represented as a collection of models that exchange data. Design optimization then involves executing a dependency tree of models to study the impact of a parameter change and to optimize that parameter. In this contribution, we present a model query mechanism allowing notebook models to query one another. The model dependencies are represented as a graph with suitable processing algorithms. In order to ensure that only affected models are executed, we derive and cache a model resolution order. The presented modelling framework relies on open-source technologies (packages: pydantic, FastAPI, Jupyter, papermill, scrapbook; containers: Docker and OpenShift; databases: MongoDB and Redis), and the talk will focus on good practices and design decisions encountered in the process.
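One way to picture a "model resolution order" is a topological sort of the dependency graph. The sketch below uses the standard library graphlib module; the model names and the affected_by helper are hypothetical and are not taken from the framework described above.

```python
from graphlib import TopologicalSorter

# Hypothetical models: each key depends on the models in its set.
DEPENDENCIES = {
    "geometry": set(),
    "mesh": {"geometry"},
    "thermal": {"mesh"},
    "structural": {"mesh"},
    "report": {"thermal", "structural"},
}


def resolution_order(dependencies):
    """Return an order in which every model comes after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


def affected_by(changed, dependencies):
    """Return the changed model and everything downstream, in execution order."""
    order = resolution_order(dependencies)
    affected = {changed}
    for model in order:
        # Walking in resolution order means dependencies are classified first.
        if dependencies[model] & affected:
            affected.add(model)
    return [model for model in order if model in affected]
```

For example, affected_by("thermal", DEPENDENCIES) selects only the thermal model and the report, so the geometry, mesh, and structural models would not need to be re-run. Caching the resolution order, as the abstract mentions, would avoid recomputing the sort on every query.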
TalkPyData: Software Packages & Jupyter
ShapePipe: A modular weak-lensing processing and analysis pipeline
I will present the first public release of ShapePipe, an open-source and modular weak-lensing measurement, analysis and validation pipeline written in Python. I will begin by giving an (easy-to-follow) introduction on how and why we measure the shapes of galaxies to map the distribution of dark matter in the Universe. I will then describe the design of the software, mentioning the numerous Python packages we used, and justify the choices we made. I will conclude by discussing some of the lessons we learned along the way and how these can be applied to other scientific software development projects.
Talk~None of the above
Elephants, ibises and a more Pythonic way to work with databases
In this talk, I will be talking about Ibis, a software package that provides a more Pythonic way of interacting with multiple database engines. In my own adventures living in Zimbabwe, I've always encountered ibises (the bird versions) perched on top of elephants. If you've never seen an elephant in real life I can confirm that they are huge, complex creatures. The image of a small bird sitting on top of a large elephant serves as a metaphor for how ibis (the package) provides a less complex, more performant way for Pythonistas to interact with multiple big data engines.
I'll use the metaphor of elephants and ibises to show how this package can make a data workflow more Pythonic. The Zen of Python lets us know that simple is better than complex. The bigger and more complex your data, the more of an argument there is to use Ibis. Raw SQL can be quite difficult to maintain when your queries are very complex. For Python programmers, Ibis offers a way to write SQL in Python that allows for unit-testing, composability, and abstraction over specific query engines (e.g. BigQuery)! You can carry out joins, filters, and other operations on your data in a familiar, Pandas-like syntax. Overall, using Ibis simplifies your workflows, makes you more productive, and keeps your code readable.
TalkPyData: Software Packages & Jupyter
Lint All the Things!
Code that's uniform is easier to read, write, and debug, but writing down your standards and conventions in a README that no one reads isn't enough. The explosion of CI and linter tools allows you to not only document your standards and conventions, but also make sure people actually adhere to them.
TalkPython Libraries
Sponsor Recruitment Session
Many of our sponsors are looking to hire talented people and EuroPython is the perfect place to reach out to them!
In this session, our sponsors will each give a short presentation about their company and what they do with Python. You can then approach them directly at their booth to discuss more details.
TalkSponsor
Rapid prototyping in BBC News with Python and AWS
BBC News Labs is an innovation team within BBC R&D, working with journalists and production teams to build prototypes to demonstrate and trial new ideas for ways to help journalists or bring new experiences to audiences.
Working in short project cycles, it's important for us to be able to quickly build processing pipelines connected to BBC services, test and iterate on ideas and demonstrate working prototypes. We make use of modern cloud technologies to accelerate delivery and reduce friction.
In this talk I will share our ways of working, our ideation and research methods, and the tools we use to be able to build, deploy and iterate quickly, the BBC's cloud deployment platform, and our use of serverless AWS services such as Lambda, Step Functions and Serverless Postgres.
TalkSoftware Engineering & Architecture
How I wrote a Python client for HTTP/3 proxies
MASQUE (Multiplexed Application Substrate over QUIC Encryption) is a draft of a new protocol that allows running proxy or VPN services indistinguishable from HTTPS servers. Akamai built a managed proxy service based on the MASQUE protocol to provide egress proxy for iCloud Private Relay.
While working on the proxy at Akamai, I wrote a Python client for testing the proxy service. The MASQUE protocol can tunnel traffic through HTTP/3 or HTTP/2, but common Python libraries only support HTTP/1.1. The tunneled traffic can use any protocol on top of TCP or UDP, including all HTTP versions, so MASQUE can be proxied through MASQUE for onion routing.
In this talk, I will show that the MASQUE proxy design is simple and yet client implementations are complex. To put everything into context, I will recap how HTTP proxies operate and how HTTP versions differ. I will highlight lessons learned from designing a low-level HTTP client using Python asyncio.
TalkWeb
Data Validation for Data Science
Have you ever worked really hard on choosing the best algorithm, tuned the parameters to perfection, and built awesome feature engineering methods only to have everything break because of a null value? Then this tutorial is for you! Data validation is often neglected in the process of working on data science projects. In this tutorial, we will demonstrate the importance of implementing data validation for data science in commercial, open-source, and even hobby projects. We will then dive into some of the open-source tools available for validating data in Python and learn how to use them so that edge cases will never break our models. The open-source Python community will come to our help and we will explore wonderful packages such as Pydantic for defining data models, Pandera for complementing the use of Pandas, and Great Expectations for diving deep into the data. This tutorial will benefit anyone working on data projects in Python who wants to learn about data validation. Some Python programming experience and understanding of data science are required. The examples used and the context of the discussion are centred on data science, but the knowledge can be applied in any Python-oriented project.
TutorialPyData: Data Engineering
Using Python to predict asset price reversals
Using Pandas, Python and Plotly to locate potential trend reversals in Stocks, Crypto or any OHLC feed. Learn how to locate Fibonacci retrace levels and predict price reversal zones for the lowest risk entry to a trade.
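The arithmetic behind Fibonacci retracement levels is simple enough to sketch in plain Python, without Pandas or Plotly. The ratios below are the commonly quoted ones; the retracement_levels helper and the prices of 100 and 200 are invented for illustration and are not taken from the talk.

```python
# Commonly used retracement ratios (an assumption; charting tools vary).
FIB_RATIOS = (0.236, 0.382, 0.5, 0.618, 0.786)


def retracement_levels(swing_low, swing_high):
    """Price levels for an upward move pulling back from swing_high."""
    span = swing_high - swing_low
    # Each level sits `ratio` of the way back down from the high.
    return {ratio: round(swing_high - span * ratio, 2) for ratio in FIB_RATIOS}


# Hypothetical swing from 100 up to 200.
levels = retracement_levels(100.0, 200.0)
```

In this made-up example the 50% level falls at 150 and the 61.8% level near 138.2. Whether prices actually reverse around such levels is the claim the talk sets out to explore, not something this sketch demonstrates.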
Sponsored TalkPyData: Data Engineering
From circuit board design to finished product: the hobbyist's guide to hardware manufacturing
Ever wondered how hardware is made, or curious about making your own?
We share our experiences manufacturing a programmable gamepad for use in IoT/MicroPython workshops.
We will cover the entire production process, including:
- Designing the PCB (Printed Circuit Board)
- Choosing microcontroller and parts
- Finding, ordering and assembling components
- Pulling together firmware, drivers and software
Mistakes were indeed made along the way. Let's turn them into valuable lessons!
TalkMakers
Clean Architectures in Python
A brief talk that introduces software developers to the idea of "clean architecture" and discusses how to reduce coupling between parts of a software system through well-known strategies such as abstraction and inversion of control.
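As a small sketch of abstraction combined with inversion of control (all class names here are hypothetical), the core use case below depends only on an abstract Notifier, and the concrete implementation is passed in from the outside.

```python
from abc import ABC, abstractmethod


class Notifier(ABC):
    """Abstraction owned by the core logic; delivery details live elsewhere."""

    @abstractmethod
    def send(self, recipient: str, message: str) -> None:
        ...


class RecordingNotifier(Notifier):
    """One possible adapter; an email or SMS adapter could replace it."""

    def __init__(self):
        self.sent = []

    def send(self, recipient: str, message: str) -> None:
        self.sent.append((recipient, message))


class SignupService:
    """Core use case: it never imports or constructs a concrete notifier."""

    def __init__(self, notifier: Notifier):
        self._notifier = notifier

    def register(self, email: str) -> str:
        self._notifier.send(email, "Welcome!")
        return email


notifier = RecordingNotifier()
service = SignupService(notifier)
service.register("ada@example.com")
```

Because the dependency points from the adapter towards the core abstraction, the service can be tested with the recording adapter and deployed with a real one without being changed.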
TalkSoftware Engineering & Architecture
HPy: a better C API for Python
The official Python C API is specific to the current implementation of CPython. It has served us well and forms the basis upon which our entire extension ecosystem rests. However, it exposes a lot of internal details which makes it hard to implement it for other Python implementations (e.g. PyPy, GraalPython, Jython, IronPython, etc.), and prevents major evolutions of CPython itself, such as using a GC instead of refcounting, or removing the GIL.
This is where HPy comes in. It's a new C API designed from the ground up according to the following goals:
- running much faster on alternate implementations, and at native speed on CPython
- making it possible to compile a single binary which runs unmodified on all supported Python implementations and versions
- being simpler and more manageable than the Python/C API
- providing an improved debugging experience.
We'll discuss its current status and show how existing extensions can be gradually ported to it.
Talk(c)Python Internals
Automated Refactoring Large Python Codebases
Like many companies with multi-million-line Python codebases, Carta has struggled to adopt best practices like Black formatting and type annotation. The extra work needed to do the right thing competes with the almost overwhelming need for new development, and unclear code ownership and lack of insight into the size and scope of type problems add to the burden. We've greatly mitigated these problems by building an automated refactoring pipeline that applies Black formatting and backfills missing types via incremental GitHub pull requests. Our refactor applications use LibCST and MonkeyType to modify the Python syntax tree and use GitPython/PyGithub to create and manage pull requests. It divides changes into small, easily reviewed pull requests and assigns appropriate code owners to review them. After creating and merging more than 3,000 pull requests, we have fully converted our large codebase to Black format and have added type annotations to more than 50,000 functions. In this talk, you'll learn to use LibCST to build automated refactoring tools that fix general Python code quality issues at scale and how to use GitPython/PyGithub to automate the code review process. Slides: https://www.slideshare.net/jimmy_lai/europython-2022-automated-refactoring-large-python-codebases
TalkSoftware Engineering & Architecture
Django Girls Workshop
If you are a female and you want to learn how to make websites, we have good news for you! We are holding a one-day Django workshop for beginners! Applications open for registration till Sat 2 July. We have space only for 30 people, so make sure to fill the form very carefully!
Special Workshop~None of the above
Scaling scikit-learn: introducing new sets of computational routines
For more than 10 years, scikit-learn has been bringing machine learning and data science methods to the world. Throughout that time, the library has always aimed to deliver quality implementations, focusing on a clear and accessible code-base built on top of the PyData ecosystem.
This talk aims to explain the ongoing work of the scikit-learn developers to boost the native performance of the library.
TalkPyData: Software Packages & Jupyter
Opening Session [in-person & remote]
Opening Session
Opening Session~None of the above
`typing.Protocol`: type hints as Guido intended
If your type-hinted Python code is Java flavored, you're probably underusing typing.Protocol. Python is literally built on structural typing, a.k.a. duck typing. It's how __special__ methods work. Type hints were introduced in Python 3.5 without support for duck typing, but it was added in Python 3.8 and we should all be using typing.Protocol to have our code statically checked and Pythonic.
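A short sketch of the idea, with hypothetical class and function names: Connection never inherits from or mentions SupportsClose, yet it satisfies the protocol simply by having a matching close() method.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsClose(Protocol):
    """Anything with a close() method, regardless of its base classes."""

    def close(self) -> None:
        ...


class Connection:
    """Does not reference SupportsClose at all; the match is structural."""

    def __init__(self):
        self.open = True

    def close(self) -> None:
        self.open = False


def shutdown(resources: list[SupportsClose]) -> int:
    """Close every resource; a type checker accepts any object with close()."""
    for resource in resources:
        resource.close()
    return len(resources)


conn = Connection()
closed = shutdown([conn])
```

The runtime_checkable decorator is optional and only enables isinstance() checks, which test for the presence of the method rather than its signature; the static checking happens without it.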
TalkSoftware Engineering & Architecture
Using Python to manage Software Bill of Materials
Software has become increasingly complex as it is constructed from a multitude of software components. In many cases the identity of these components is hidden, as they are included through implicit dependencies. Without fully understanding the dependencies of your product, it is not possible to understand the current vulnerability status of your software product or system. In the past 12 months, there has been an increasing focus on the use of Software Bills of Materials (SBOMs) as a key artefact to be delivered with a software product; they will be mandated for all software products in some markets later in 2022. SBOMs were initially developed to capture the inter-dependencies between components (the focus was on capturing the different types of open source licences used within a product), but with the latest evolution, vulnerabilities within a product can now also be tracked.
This talk will introduce the SBOM concept and show how Python and its ecosystem can be used to create, manage and use SBOMs as part of your development pipeline.
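As a very rough starting point, and not a real SBOM in a standard format such as SPDX or CycloneDX, the standard library importlib.metadata module can list the distributions installed in the current environment together with their declared requirements. The installed_components helper and the dictionary layout are assumptions made for illustration.

```python
import json
from importlib import metadata


def installed_components():
    """List installed distributions with their versions and declared requirements."""
    components = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name is None:  # skip distributions with broken metadata
            continue
        components.append(
            {
                "name": name,
                "version": dist.version,
                "requires": sorted(dist.requires or []),
            }
        )
    return sorted(components, key=lambda component: component["name"].lower())


# Serialise the inventory so it could be stored next to a build artefact.
inventory_json = json.dumps(installed_components(), indent=2)
```

An inventory like this only covers Python packages visible to the interpreter; system libraries, vendored code, and licence information would need dedicated tooling.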
TalkDevOps
Making AI Happen at Your Company
All one needs is strategy, skill and resources to make digitalization and AI happen. So why is everything taking so long? Shouldn't you all be finished yesterday already? An honest talk about how to address the complexity of making AI happen in enterprises.
Talk~None of the above
Walk-through of Django internals
- The talk will cover the Django codebase internals and showcase various moving parts in the code.
- The talk will cover the internals of CGI, WSGI, the workings of runserver, views, middleware, app loading, Django settings loading, the ORM, Django utilities, etc.
TalkDjango
Packaging Python in 2022
Packaging in Python is one place where the common adage "There should be one and preferably only one obvious way to do it" doesn't seem to apply. There are a lot of choices to make when publishing Python code. What is absolutely essential and what is optional?
Talk~None of the above
Robyn: An async Python web framework with a Rust runtime
Python web frameworks, like FastAPI, Flask, Quart, Tornado, and Twisted, are important for writing high-performance web applications and for their contributions to the web ecosystem. However, even they have some bottlenecks, either due to their synchronous nature or due to their use of the Python runtime. Most of them don't have the ability to speed themselves up because of their dependence on *SGIs. This is where Robyn comes in. Robyn tries to achieve near-native Rust throughput along with the benefit of writing code in Python. In this talk, we will learn more about Robyn, from what Robyn is to how to develop with it.
TalkWeb
Registration @ Ground Foyer
Please show up on time for registration and bring your e-ticket along with the order code for a speedy registration. We'll require you to present a form of ID (e.g. a passport) to verify your identity.
Don't be late :)
Registration
Handling Errors the Graceful Way in Python
Things rarely go as planned, especially in the world of programming. Errors are the bane of a programmer's existence. You write an awesome piece of code, are ready to execute it and build a powerful machine learning model, and then poof. Python throws up an unexpected error, ending your hope of quick code execution.
TalkWeb
Async Django
Python has a full set of tools for asynchronous programming - multiprocessing, multithreading, coroutines, etc. And Django uses most of them.
Since Django 3, we have the ability to create fully async non-blocking Django views that could handle thousands of requests concurrently.
In this talk, we'll focus on 2 key topics:
- The motivation and the decisions behind the Django async support
- Choosing the right tools to make our views async and efficient
TalkDjango
Writing Faster Python 3
Did you know that Python preallocates integers from -5 to 256? Reusing them 1000 times, instead of allocating memory for a bigger integer, can save you a couple of milliseconds of your code's execution time. If you want to learn more about these kinds of optimizations then, ... well, probably this presentation is not for you :) Instead of going into such small details, I will talk about more "sane" ideas for writing faster code.
After a brief overview of different levels of optimization and how they work in Python, I will show you simple and fast ways of measuring the execution time of your code and finally, discuss examples of how some code structures could be improved.
You will see:
- The fastest way of removing duplicates from a list
- How much faster your code is when you reuse the built-in functions instead of trying to reinvent the wheel
- What is faster than the "for loop"
- If the lookup is faster in a list or a set
- When it's better to beg for forgiveness than to ask for permission
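A few of the points above can be tried out with the standard library alone. The snippet below is a sketch with made-up data, not the speaker's benchmark, and the lookup helper is hypothetical; actual timings depend on the machine and the Python version.

```python
import timeit

items = [3, 1, 3, 2, 1]

# Remove duplicates while keeping first-seen order (dicts preserve insertion order).
unique = list(dict.fromkeys(items))

# Membership tests: a set uses hashing, a list is scanned element by element.
haystack_list = list(range(10_000))
haystack_set = set(haystack_list)
list_time = timeit.timeit(lambda: 9_999 in haystack_list, number=200)
set_time = timeit.timeit(lambda: 9_999 in haystack_set, number=200)


def lookup(mapping, key, default=None):
    """Beg for forgiveness: try the operation and handle the failure."""
    try:
        return mapping[key]
    except KeyError:
        return default
```

Using set(items) also removes duplicates but does not keep the original order, and the try/except style tends to pay off when the key is usually present, since the exception path is the expensive one.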
TalkSoftware Engineering & Architecture
CPython Developer Panel
Come meet the folks who make the Python programming language!
A panel discussion of core Python developers will take place on Wednesday at 2pm. Hear what's on their mind, what they're working on and what the future holds for Python.
With Pablo Galindo Salgado, Steve Dower, Batuhan Taskaya, Ken Jin, Irit Katriel and Dr. Mark "HotPy" Shannon. Chaired by the esteemed Łukasz "Any color you like so long as it's black" Langa.
Panel(c)Python Internals
Pew Pew Workshop
Join Radomir Dopieralski, creator of Pew Pew Games Console, and learn how to programme this game console with CircuitPython. Attendees of the Pew Pew workshop will receive a Pew Pew games console to take home with them after the workshop.
The Pew Pew workshop is free for EuroPython conference or training ticket holders, however, spaces are limited.
Special WorkshopEvents
Killer Robots Considered Harmful
Killer robots may sound like something from a movie, but in recent years weapons have been developed that can select targets and attack without any human input, and expert systems have been used to assist in military targeting.
Some argue that this is a positive development, because automation can increase precision in targeting and reduce civilian casualties. However, others point out that highly automated systems do not have a good track record in complex and high-stakes real world situations, and military conflict is unlikely to be better.
This talk will outline the technological underpinnings of autonomous weapons and automated targeting systems, as well as examining the legal and ethical debate over these systems that has been happening at the UN over the past decade.
KeynoteKeynotes
Beginners' Day - Humble Data
Would you like to learn to code but don't know where to start? Taking your first steps in programming can seem like an impossible task, so we've decided to put on a workshop to show 30 beginners how it can be done and share our passion for the world of data science.
Special WorkshopEvents
Revolutionizing Education: How Python is Essential Beyond Computer Science
Python has had a transformational effect on countless fields so far, but its permeation can be accelerated through the integration of Python into non-computing coursework. Currently, Python's presence within secondary and post-secondary schools varies greatly between different institutions, but the continuity in the lack of interdisciplinary coursework is a key limiting factor in the widespread growth of computing education. This is due to a variety of factors, including stereotypes and policy issues, but the bottom line is that Python being restricted to only computing classes restricts career opportunities and misrepresents the professional world. With support from a case-study of college-level physics students exposed to scientific programming, we propose novel methods of integrating Python into traditional coursework during this talk. The overarching mission of this discussion is to demonstrate how Python literacy in non-computing coursework can ultimately help in streamlining processes and accelerating progress in various industries. Attendees will have the opportunity to hear about the exciting prospect of expanding Python beyond the confines of computer science, and will have an exclusive look at a case-study that offers insight into student benefits of integrated coursework: as concerned stakeholders, it is ultimately well-informed Python community members who must unite to make a positive impact on the education system.
TalkEducation, Teaching & Further Training
A Network Embeddings based Recommendation Model with multi-factor consideration
Recommendation systems are increasingly in demand to provide a personalized customer experience for diversified product mix offerings. Traditionally we use interaction information based on user preferences and item characteristics. This brings in collaborative filtering-driven recommendations with higher accuracy and relevance. However, such a method has certain limitations in utilizing implicit information like cross-domain specific factors that are equally important for making personalized recommendations. We propose an improvised way of using network embeddings based matrix factorization technique with multi-factors to make a match between both implicit and explicit features resulting in more accurate recommendation.
TalkPyData: Machine Learning, Stats
Forget Mono vs. Multi-Repo - Building Centralized Git Workflows with Python
The mono vs. multi-repo is an age-old debate in the DevOpsphere, and one that can still cause flame wars. What if I were to tell you that you don't have to choose? In this talk we will dive into how we built a centralized Git workflow that can work with any kind of repo architecture, delivered with Python.
One of the greatest recurring pains in CI/CD is the need to reinvent the wheel and define your CI workflow for each and every repository or (micro)service, when eventually 99% of the config is the same. What if we could hard reset this paradigm and create a single, unified workflow that is shared by all of our repos and microservices? In this talk, we will showcase how a simple solution implemented in Python, demoed on Github as the SCM, and Github Actions for our CI, enabled us to unify this process for all of our services, and improve our CI/CD velocity by orders of magnitude.
TalkDevOps
Code coverage through unit tests running in sub-processes/threads: Locally and automated on GitHub
Unit testing and code coverage are two essential aspects of an open-source codebase. These unit tests often run in spawned sub-processes or threads, which allows them to run in parallel. This also makes it easier to stop the tests midway if a process is taking too much time (probabilistic tests).
However, running unit tests in a sub-process creates a problem in the local repository as well as in the remote repository. As the documentation of coverage.py says: "Measuring coverage in those sub-processes can be tricky because you have to modify the code spawning the process to invoke coverage.py."
TalkDevOps
Super Search with OpenSearch and Python
OpenSearch is an open source and free document database with search and aggregation superpowers, based on Elasticsearch. This session covers how to use OpenSearch to perform both simple and advanced searches on semi-structured data such as a product database.
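As a rough illustration of a simple versus a more advanced search on a product database, the sketch below builds two request bodies in the Elasticsearch-style query DSL that OpenSearch uses. The field names (`title`, `price`, `brand.keyword`) are hypothetical, and the bodies are only constructed here, not sent to a cluster.

```python
# Hypothetical field names; nothing is sent to a server in this sketch.

# Simple search: full-text match on a single field.
simple_query = {
    "query": {
        "match": {"title": "laptop"}
    }
}

# More advanced search: full-text match combined with a price filter,
# plus a terms aggregation that counts hits per brand.
advanced_query = {
    "query": {
        "bool": {
            "must": [{"match": {"title": "laptop"}}],
            "filter": [{"range": {"price": {"lte": 1000}}}],
        }
    },
    "aggs": {
        "by_brand": {"terms": {"field": "brand.keyword"}}
    },
}
```

A client library or a plain HTTP request to the index's `_search` endpoint would carry such a body to the server.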
TalkPyData: Data Engineering
Jupyter - Under the Hood
Jupyter Notebooks at their core are just JSON documents that contain all your code, markdown styles and outputs. Yet when you run a notebook, there's a lot that's happening under the hood - from starting a session with the notebook server, to launching an IPython kernel, and a rich Web UI communicating with the notebook server and the IPython kernel using Jupyter's REST APIs and ZMQ websockets. We will explore the Jupyter ecosystem (Jupyter, JupyterLab, JupyterHub) and see how this system comes together.
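To make the "just JSON documents" point concrete, here is a small sketch that builds a minimal notebook by hand in the nbformat 4 layout and reads it back with the standard `json` module. The cell contents and ids are made up for illustration; real notebooks carry more metadata.

```python
import json

# A hand-written, minimal notebook in the nbformat 4 layout (illustrative content).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "id": "intro",
            "metadata": {},
            "source": ["# Demo"],
        },
        {
            "cell_type": "code",
            "id": "cell-1",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["print('hello')"],
        },
    ],
}

# Round-trip through JSON text, as happens when an .ipynb file is saved and loaded.
text = json.dumps(notebook, indent=1)
loaded = json.loads(text)

# Collect the source of every code cell.
code_sources = [
    "".join(cell["source"])
    for cell in loaded["cells"]
    if cell["cell_type"] == "code"
]
```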
TalkPyData: Software Packages & Jupyter
Choosing the right database for your next project - Looking at options beyond PostgreSQL and MySQL
In the last few years, lots of new database engines have been developed, making the selection process even more challenging than it was before, if you want to maintain an edge.
The talk will give an overview of what to consider in different situations.
TalkPyData: Data Engineering
Best practices to open source a product and creating a community around it
In certain areas of the industry open source has become mainstream, whether it be a small part of a product, a "community edition of a product", or creating a whole business around an open source product. One could assume the only thing required to do so is to make the source code of the project publicly accessible, possibly by putting it on a platform such as GitLab or GitHub, and one couldn't be more wrong.
In this talk we explore those aspects such as the licence and the governance of the project and the impact they can have. Then we talk about common mistakes teams make which create an environment where outsiders don't necessarily feel welcomed to the project. First impressions matter and it's important that new contributors and users stay once they encounter the project.
TalkCommunity & Diversity
Property-based testing the Python way
What if I told you you could write simpler tests but still get better results?
What if I told you you can automatically generate your test data?
This may sound difficult with a traditional testing approach, but it can be easily done with property-based testing.
Property-based testing allows a range of inputs to be tested on a given aspect of a software property, abstracting away the details.
In the world of Python you can accomplish this with Hypothesis, the Python library used for property-based testing.
Hypothesis helps you design cleaner and cleverer test suites.
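The idea of testing a property over many generated inputs can be sketched with the standard library alone. The example below is a hand-rolled stand-in that uses `random`, not the Hypothesis API; Hypothesis generates inputs for you and also shrinks failing cases to small examples. The function names and the chosen property (decoding an encoded string returns the original) are illustrative assumptions.

```python
import random


def run_length_encode(text):
    """Encode 'aaab' as [('a', 3), ('b', 1)]."""
    encoded = []
    for ch in text:
        if encoded and encoded[-1][0] == ch:
            # Same character as the previous run: extend that run.
            encoded[-1] = (ch, encoded[-1][1] + 1)
        else:
            # New character: start a new run of length 1.
            encoded.append((ch, 1))
    return encoded


def run_length_decode(encoded):
    """Invert run_length_encode."""
    return "".join(ch * count for ch, count in encoded)


def check_round_trip(trials=200, seed=0):
    """Property: decode(encode(text)) == text for many random strings."""
    rng = random.Random(seed)
    for _ in range(trials):
        length = rng.randint(0, 20)
        text = "".join(rng.choice("ab") for _ in range(length))
        assert run_length_decode(run_length_encode(text)) == text
    return trials


checked = check_round_trip()
```

Instead of listing specific input and output pairs, the test states one rule that must hold for every input, which is the core of the property-based approach.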
TalkTesting
Correlating messy data with "correlate"
An introduction to the correlate Python library. You tell correlate about two datasets that should map to each other, and it determines the best matches for you. The novel scoring algorithm at the heart of correlate means it copes exceedingly well with messy real-world data. correlate supports fuzzy matching, weighted matching, and ordering.
TalkPython Libraries
What happens when you import a module?
It's a rare program that doesn't include at least one "import" statement. But what actually happens when we import a module? How does Python find our file, decide whether to load it, and then keep track of it in memory? In this talk, I'll walk you through what happens when you "import" a module into Python, revealing the complexities of something seemingly simple that we use every day.
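A few of these steps can be observed from Python itself. The sketch below uses only the standard library and the `json` module as an arbitrary example; it shows the module cache and the lookup result, not the full import machinery.

```python
import importlib.util
import sys

# Importing locates the module, executes it, and caches the module object
# (unless it was already imported earlier in this process).
import json

# The loaded module object is cached in sys.modules under its name.
cached = sys.modules["json"]

# A repeated import does not reload the file; it returns the cached object.
import json as json_again

same_object = json_again is cached

# The import system describes a module via a "spec",
# which records its name and where it was found.
spec = importlib.util.find_spec("json")
origin = spec.origin

# sys.path is the list of locations searched for top-level modules.
search_locations = list(sys.path)
```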
Talk(c)Python Internals
The beginner's data science project checklist
In this talk, I will give you tips and practical advice on what steps to follow and how to plan your data science project to avoid making the most common mistakes during its development.
Despite my limited experience delivering data science projects, I have learned how to avoid certain mistakes. In this talk I will teach you how to prevent them and save you lots of headaches.
TalkPyData: Machine Learning, Stats