ChiPy __Main__ Meeting April 2021
When: April 8, 2021, 6 p.m.
Where: Remote Meeting
Attendance:
Virtual Pythonistas: 0
Topics
-
Dangers of the Python Standard Library
By: Andrew Scott
Experience Level: Novice
Length: 15 Minutes
Description: As a developer, you are the first line of defense when it comes to the security of any product you build. A common misconception is that all software security vulnerabilities stem from misconfigurations, unmaintained open source libraries, "insecure" languages, or careless mistakes like hard-coding passwords. In reality, it is easy to make serious security mistakes using only the built-in functions and libraries bundled with the latest version of Python. This talk will cover a number of these mistakes that are all too easy to make.
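The abstract does not list which mistakes the talk covers, so the following is only an illustrative sketch of one widely known built-in pitfall: calling `eval()` on untrusted text executes arbitrary code, whereas the standard library's `ast.literal_eval()` accepts only Python literals. The helper names `parse_untrusted` and `is_rejected` are hypothetical and are not taken from the talk.

```python
import ast


def parse_untrusted(text):
    """Parse a Python literal from untrusted text without executing code.

    Hypothetical helper for illustration. ast.literal_eval accepts only
    literals (numbers, strings, lists, dicts, ...), unlike eval().
    """
    return ast.literal_eval(text)


def is_rejected(text):
    """Return True if the text is refused as anything other than a literal."""
    try:
        parse_untrusted(text)
    except (ValueError, SyntaxError):
        return True
    return False


# A plain literal parses normally.
safe_value = parse_untrusted("[1, 2, 3]")

# A function call is refused instead of being executed, which is what
# eval() would have done with the same input.
rejected = is_rejected("__import__('os').getcwd()")
```

`ast.literal_eval` is still not suitable for very large or deeply nested untrusted input, since parsing such input can exhaust memory or the interpreter stack.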
-
Getting up to speed with Dask
By: Aaron Richter
Experience Level: Intermediate
Length: 45 Minutes
Description: Dask is a parallel computing library for Python people. This talk will be a gentle introduction to Dask, showing how you can speed up data science code on your laptop with a simple "pip install". We will then use the same code to process big data on a cluster of machines, going through an end-to-end data science pipeline from ETL and exploratory analysis to machine learning model training and scoring.
We will cover:
- Example using publicly available data and single-node Python
- Pandas for data cleaning/transformation
- Scikit-learn for machine learning
- How to parallelize this workflow on a laptop and then a cluster using Dask
- Distributed model training
- Distributed inference/scoring
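
As a rough illustration of the parallelization step in the outline above, here is a minimal sketch using `dask.delayed`, assuming Dask is installed via "pip install dask". The `clean` and `score` functions and the sample records are hypothetical stand-ins for the pandas and scikit-learn steps; they are not the speaker's code.

```python
from dask import delayed


def clean(record):
    # Stand-in for a data cleaning/transformation step.
    return record.strip().lower()


def score(record):
    # Stand-in for a model scoring step.
    return len(record)


raw_records = ["  Alpha", "BETA  ", " Gamma "]

# Wrapping calls in delayed() builds a lazy task graph; nothing runs yet.
scored = [delayed(score)(delayed(clean)(r)) for r in raw_records]
total = delayed(sum)(scored)

# compute() executes the graph; independent tasks can run in parallel.
result = total.compute()
```

By default this runs on a local scheduler. Running the same graph on a cluster is typically a matter of creating a `dask.distributed.Client` connected to a cluster scheduler before calling `compute()`.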