Chicago Python User Group

Python at Zoro By: Joe Neylon

Date: Nov. 11, 2021, 6:15 p.m.

Zoro is an online distributor of products for B2B customers, focused on helping small businesses easily find what they need to grow and maintain their businesses. Today, we have over eight million products available—and that number is expected to keep growing. We work with third-party suppliers to provide products and fulfill orders for our customers.

Zoro uses Python with Django for its e-commerce site, as well as for data science, ETLs, and microservices.

Using Python to Accelerate Data Science at Nielsen By: Jordan Bettis

Date: Nov. 11, 2021, 7:15 p.m.

Nielsen is a global leader in audience insights, data and analytics, shaping the future of media. Nielsen uses Python to bridge the gap between model development, validation and deployment into production data pipelines to accelerate creation and evolution of analytics products.

Speeding up builds with Asynchronous Tests By: Meygha Machado

Date: Oct. 14, 2021, 6 p.m.

Automated tests are a great way to iterate fast and ensure features didn't break. This talk discusses how to speed up your builds and dev cycle even more by running tests asynchronously using a pytest plugin called asyncio-cooperative.
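As a rough illustration of why running tests asynchronously can shorten a build, here is a minimal sketch using only the standard library's asyncio. It does not use the plugin mentioned above; the three test coroutines and the `timed_run` helper are hypothetical, and `asyncio.sleep` stands in for slow I/O such as a network call.

```python
import asyncio
import time


async def test_login():
    await asyncio.sleep(0.2)  # stand-in for a slow network call
    assert "user".upper() == "USER"
    return "test_login"


async def test_search():
    await asyncio.sleep(0.2)
    assert len("query") == 5
    return "test_search"


async def test_checkout():
    await asyncio.sleep(0.2)
    assert sum([1, 2, 3]) == 6
    return "test_checkout"


async def run_all(tests):
    # gather() schedules every coroutine on one event loop, so the
    # waits overlap instead of running back to back.
    return await asyncio.gather(*(test() for test in tests))


def timed_run(tests):
    start = time.perf_counter()
    results = asyncio.run(run_all(tests))
    return results, time.perf_counter() - start


results, elapsed = timed_run([test_login, test_search, test_checkout])
print(results, f"{elapsed:.2f}s")
```

Run one after another, these three tests would wait about 0.6 seconds in total; run cooperatively, the waits overlap. The gain only applies to tests that spend their time waiting on I/O and do not share mutable state.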

Financial Dashboard on Streamlit By: Shashank Katyayan

Date: Oct. 14, 2021, 6 p.m.

An easy way to build Python dashboards with Streamlit using financial data APIs.

Production-ready Machine Learning By: Zax Rosenberg

Date: Aug. 12, 2021, 6 p.m.

Building machine learning (ML) models is faster and easier now than ever before. The proliferation of open-source libraries means data scientists can leverage cutting-edge pre-trained models in just a few lines of code. Yet it remains true that most ML models never make it to production. Why? Because making it to production (and staying in production) are about more than just model and code quality. In particular, this talk will discuss how MLOps can greatly accelerate and increase the chances of model success.

Specifically, the talk will walk through the full ML lifecycle and answer: What is MLOps? Why is it important? How can MLOps infrastructure be set up quickly, easily, and with open source tools? How can the system be designed in a user-friendly way, but without too much magic? How can user adoption be accelerated?

While it's expected that data-science-related professionals will garner the most value from this talk, no prior MLOps/ML background is required to understand the contents of the talk.

Analysis and Application of Data Science and NLP in Developing HR Insights By: Manaswita Tyagi

Date: Aug. 12, 2021, 6 p.m.

In today’s world, AI has become an essential tool for achieving and creating the unthinkable. It is helping create innovative solutions for almost every industry. In the wake of this ever-growing demand for computerized intelligence, an active research area is how AI-based intelligence can be interpreted and used by HR (Human Resources), from predictive analysis to automation. As the HR department is solely responsible for recruiting and bringing valuable talent to the industry, it is essential that this task is done with maximum efficiency. Through this project, we intend to predict which employees would prefer a job change and which would stay with a company, and to help assess the resources required to invest in an employee. This presentation will take you through the principles of using Python, opinion mining, and several widely used classifiers, namely Random Forest (RF), CatBoost Classifier, Support Vector Machine (SVM), and Naïve Bayes (NB).

Managing the Test Data Nightmare By: Andrew Knight

Date: July 8, 2021, 6 p.m.

Test data for automated tests can be a nightmare to manage. Data must be prepped in advance, loaded before testing, and cleaned up afterwards. Sometimes, teams don't have much control over the data in their systems under test: it's just dropped in, and it can change arbitrarily. Hard-coding values into tests that reference system data can make the tests brittle, especially when running tests in different environments. In this talk, I'll teach strategies for managing each type of test data: test case variations, test control inputs, config metadata, and product state. We will cover how to "discover" test data instead of hard-coding it, how to pass inputs into automation (including secrets like passwords), and how to manage data in the system. After this talk, you will wake up from the nightmare and handle test data cleanly and efficiently like a pro!
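One way to picture two of these ideas, passing inputs in from outside and discovering data instead of hard-coding it, is the sketch below. It is not taken from the talk: the environment variable names, the `load_inputs` and `discover_in_stock_product` helpers, and the product records are all hypothetical.

```python
import os


def load_inputs(env=None):
    """Read test control inputs from the environment instead of the test code."""
    env = os.environ if env is None else env
    return {
        # Hypothetical variable names; non-secret inputs get local defaults.
        "base_url": env.get("TEST_BASE_URL", "http://localhost:8000"),
        "username": env.get("TEST_USERNAME", "demo"),
        # Secrets get no default, so they never live in the repository.
        "password": env.get("TEST_PASSWORD"),
    }


def discover_in_stock_product(products):
    """Pick a usable record from whatever the environment currently holds."""
    for product in products:
        if product["stock"] > 0:
            return product
    raise LookupError("no in-stock product available in this environment")


# Example: the same test logic works against different environments.
inputs = load_inputs({"TEST_BASE_URL": "https://staging.example.com",
                      "TEST_PASSWORD": "s3cret"})
catalog = [{"id": 17, "stock": 0}, {"id": 42, "stock": 3}]
product = discover_in_stock_product(catalog)
print(inputs["base_url"], product["id"])
```

In a real suite the catalog would come from a query against the system under test, so the test keeps working when product IDs differ between environments.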

Bootstrapping your Local Python Environment By: Calvin Hendryx-Parker

Date: July 8, 2021, 6 p.m.

You cracked open your brand new Mac or Linux dream machine and, lo and behold, it has Python out-of-the-box and ready to roll… Or so you think? Maybe you want to get started doing Python development on Windows and see that you can grab Python easily from the Microsoft Store. Should you? Let’s talk about getting started with the end in mind and making sure your development computer doesn’t become the next superfund site (https://xkcd.com/1987/). We’ll quickly go through a tour of the various options such as pyenv, venv, virtualenv, conda and Docker as great ways to make sure you can develop in a sane environment.
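Of the options listed, venv ships with Python itself, so it can be sketched with the standard library alone. The example below is an illustration rather than the talk's recommended setup: the `make_env` and `in_virtualenv` helpers and the `.venv` directory name are assumptions, and pip is left out of the new environment to keep the example fast.

```python
import sys
import tempfile
import venv
from pathlib import Path


def make_env(parent):
    """Create an isolated environment in <parent>/.venv and return its path."""
    env_dir = Path(parent) / ".venv"
    # with_pip=False skips bootstrapping pip, which keeps this quick.
    venv.create(env_dir, with_pip=False)
    return env_dir


def in_virtualenv():
    """True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


with tempfile.TemporaryDirectory() as project:
    env_dir = make_env(project)
    # Every venv records its base interpreter in pyvenv.cfg.
    has_config = (env_dir / "pyvenv.cfg").is_file()

print("config written:", has_config)
print("running inside a virtualenv:", in_virtualenv())
```

From a shell, the usual equivalent is `python -m venv .venv` followed by activating the environment, which keeps project packages away from the system Python.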

Anvil: Full Stack Web with Nothing but Python By: Meredydd Luff

Date: June 10, 2021, 6 p.m.

Building a modern web app requires so much: HTML, CSS, JS, Python, SQL, React, Bootstrap, Webpack, Django... What if we could build a better abstraction?

In this talk, I'll introduce Anvil, a full-stack Python environment where everything is a Python object, from your UI components to your database rows. I'll walk you through how and why we constructed this new approach to the web.

We'll start with a question: Why is web programming hard? It's because your data takes so many forms: database rows, Python objects, JSON on REST, JS objects, HTML DOM, and finally pixels. Most of a web developer's job is translating between these awkwardly different representations. Frameworks like Django help, but now you have a stack of leaky abstractions: web frameworks, ORMs, JS frameworks, CSS frameworks, build tools... These frameworks help you go faster, but they double the amount you need to know!

So I'll show our stab at an answer: A framework where everything is a Python object, requests to the server are function calls, and Python is a browser-side language. I'll talk about running Python in the browser. I'll talk about full-stack autocompletion. There will even be live coding.

Getting up to speed with Dask By: Aaron Richter

Date: April 8, 2021, 6 p.m.

Dask is a parallel computing library for Python people. This talk will be a gentle introduction to Dask, showing how you can improve the speed of data science code on your laptop with a simple "pip install". Then we will use the same code to process big data on a cluster of machines. We will be going through an end-to-end data science pipeline, from ETL and exploratory analysis to machine learning model training and scoring.

We will cover:
- Example using publicly available data and single-node Python
- Pandas for data cleaning/transformation
- Scikit-learn for machine learning
- How to parallelize this workflow on a laptop and then a cluster using Dask
- Distributed model training
- Distributed inference/scoring
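The general pattern behind parallelizing such a workflow (split the data into partitions, compute a partial result on each, then combine) can be sketched with the standard library. This is a stand-in for illustration, not Dask's API: `partition`, `partial_stats`, `parallel_mean`, and the sample values are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(rows, n_parts):
    """Split rows into at most n_parts contiguous chunks."""
    size = -(-len(rows) // n_parts)  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def partial_stats(chunk):
    """Per-partition work: return (sum, count) so results can be merged."""
    return sum(chunk), len(chunk)


def parallel_mean(rows, n_parts=4):
    if not rows:
        raise ValueError("rows must not be empty")
    chunks = partition(rows, n_parts)
    # Threads show the structure only; CPU-bound pure-Python work would
    # need processes or a cluster to actually run in parallel.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(partial_stats, chunks))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


values = [12.5, 7.0, 30.25, 9.75, 18.0, 22.5, 5.0, 14.0]
mean_value = parallel_mean(values)
print(mean_value)
```

Returning (sum, count) pairs instead of per-chunk means is what makes the combine step correct when partitions have different sizes.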