ChiPy __Main__ Meeting May 2021: Thu, May 13 2021 at 06:00 PM at Remote Meeting
Thu, May 06 2021 at 06:00 PM at Remote Meeting (Gather Town)
Thu, Apr 15 2021 at 06:00 PM at Remote Project Night (chipy.town)
ChiPy __Main__ Meeting April 2021: Thu, Apr 08 2021 at 06:00 PM at Remote Meeting
(15 Minutes)
By: Andrew Scott
Experience Level: Novice
As a developer, you are the first line of defense when it comes to the security of any product you build. A common misconception is that all software security vulnerabilities come from misconfigurations, unmaintained open source libraries, "insecure" languages, or careless mistakes such as hard-coding passwords. In reality, it is easy to make serious security mistakes even when using only the built-in functions and libraries bundled with the latest version of Python. This talk will cover a number of these mistakes that are all too easy to make.
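One illustration of the kind of built-in pitfall the abstract describes (an assumption chosen for this sketch, not necessarily an example from the talk): passing untrusted text to the built-in eval() runs arbitrary code, while the standard library's ast.literal_eval() accepts only Python literals. The function names and sample inputs below are hypothetical.

```python
import ast


def parse_unsafe(text):
    # Risky: eval() evaluates ANY expression, including calls that import
    # modules or touch the filesystem. Shown only for contrast.
    return eval(text)


def parse_literal(text):
    # Safer: literal_eval() accepts only literals (numbers, strings, lists,
    # dicts, tuples, sets, booleans, None) and raises on anything else.
    return ast.literal_eval(text)


def is_rejected(text):
    # Helper for this sketch: True if literal_eval refuses the input.
    try:
        parse_literal(text)
    except (ValueError, SyntaxError):
        return True
    return False


if __name__ == "__main__":
    print(parse_literal("[1, 2, 3]"))
    # A function call is not a literal, so it is refused rather than executed.
    print(is_rejected("__import__('os').getcwd()"))
```

Even literal_eval() is not a complete defense: very large or deeply nested input can still exhaust memory or the parser's recursion limit, so input from untrusted sources should also be size-limited.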
(45 Minutes)
By: Aaron Richter
Experience Level: Intermediate
Dask is a parallel computing library for Python people. This talk will be a gentle introduction to Dask, showing how you can improve the speed of data science code on your laptop with a simple "pip install". Then we will use the same code to process big data on a cluster of machines. We will be going through an end-to-end data science pipeline, from ETL and exploratory analysis to machine learning model training and scoring.
We will cover:
- Example using publicly available data and single-node Python
- Pandas for data cleaning/transformation
- Scikit-learn for machine learning
- How to parallelize this workflow on a laptop and then a cluster using Dask
- Distributed model training
- Distributed inference/scoring
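The parallelization step listed above rests on a split, apply, combine pattern: break the data into chunks, compute a partial result per chunk, then merge the partials. The sketch below shows that pattern using only the standard library's concurrent.futures instead of Dask, so it runs without extra installs. It is not Dask's API, and the function names and chunk size are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(values, size):
    # Split a list into consecutive chunks of at most `size` items.
    return [values[i:i + size] for i in range(0, len(values), size)]


def partial_sum_and_count(chunk):
    # Per-chunk work: return the pieces needed to combine into a mean later.
    return sum(chunk), len(chunk)


def parallel_mean(values, chunk_size=1000, workers=4):
    # Map the per-chunk function over all chunks, then combine the partials.
    # Assumes `values` is non-empty.
    chunks = chunked(values, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(partial_sum_and_count, chunks))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


if __name__ == "__main__":
    data = list(range(10_000))
    print(parallel_mean(data))
```

Threads are used here only to keep the sketch self-contained. For CPU-bound pure-Python work they give little speedup because of the global interpreter lock, which is one reason to hand the scheduling to processes or to a library such as Dask.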