ChiPy __Main__ Meeting April 2021


When: April 8, 2021, 6 p.m.

Where: Remote Meeting

Attendance:
Virtual Pythonistas: 0

Topics


  • Dangers of the Python Standard Library
    By: Andrew Scott
    Experience Level: Novice
    Length: 15 Minutes
    Description:

    As a developer, you are the first line of defense when it comes to security for any product you build. There is a common misconception that all software security vulnerabilities come from misconfigurations, unmaintained open source libraries, "insecure" languages, or careless mistakes like hard-coding passwords. In reality, it is very easy to make potentially severe security mistakes even when using only the built-in functions and libraries bundled with the latest version of Python. This talk will cover a number of these potential security mistakes that are all too easy to make.
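
    The abstract does not list the specific pitfalls the talk covers. As one illustration of the kind of mistake it describes, the sketch below contrasts the built-in eval() with ast.literal_eval() from the standard library when parsing untrusted text. The function names and sample inputs are hypothetical and chosen only for this example.

```python
import ast


def parse_setting_unsafe(text):
    # eval() evaluates ANY Python expression, so untrusted input
    # can call functions, import modules, and so on.
    return eval(text)


def parse_setting_safer(text):
    # ast.literal_eval() accepts only Python literals (numbers, strings,
    # lists, dicts, ...) and raises an error for anything else.
    return ast.literal_eval(text)


# A harmless literal parses the same way with both functions.
benign = "[1, 2, 3]"
print(parse_setting_safer(benign))

# An expression containing a function call is rejected by literal_eval,
# whereas eval() would have executed it.
hostile = "__import__('os').getcwd()"
try:
    parse_setting_safer(hostile)
    rejected = False
except ValueError:
    rejected = True
print("hostile input rejected:", rejected)
```

    Even ast.literal_eval() is only a narrower parser, not a complete defense: very large or deeply nested input can still exhaust memory or the interpreter stack.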

  • Getting up to speed with Dask
    By: Aaron Richter
    Experience Level: Intermediate
    Length: 45 Minutes
    Description:

    Dask is a parallel computing library for Python people. This talk will be a gentle introduction to Dask, showing how you can improve the speed of data science code on your laptop with a simple "pip install". Then we will use the same code to process big data on a cluster of machines. We will be going through an end-to-end data science pipeline, from ETL and exploratory analysis to machine learning model training and scoring.

    We will cover:
    - Example using publicly available data and single-node Python
    - Pandas for data cleaning/transformation
    - Scikit-learn for machine learning
    - How to parallelize this workflow on a laptop and then a cluster using Dask
    - Distributed model training
    - Distributed inference/scoring
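
    The talk's own code is not reproduced in this listing. As a rough sketch of the idea of parallelizing an existing workflow, the example below wraps two ordinary Python functions with dask.delayed and runs them with dask.compute. The clean and score functions and the sample records are hypothetical stand-ins for the Pandas and Scikit-learn steps named above, and the example assumes Dask is installed.

```python
import dask
from dask import delayed


def clean(record):
    # Hypothetical cleaning step: trim whitespace and lowercase the text.
    return record.strip().lower()


def score(record):
    # Hypothetical scoring step: the string length stands in for a
    # model prediction.
    return len(record)


raw_records = ["  Alpha ", "BETA", " gamma  "]

# Build a lazy task graph; no work is done at this point.
tasks = [delayed(score)(delayed(clean)(r)) for r in raw_records]

# dask.compute() executes the graph and returns the results as a tuple.
results = dask.compute(*tasks)
print(results)
```

    The same task graph can be sent to a cluster by creating a dask.distributed Client connected to a scheduler before calling compute, which matches the laptop-then-cluster progression the abstract describes.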