A spreadsheet is a wonderful invention and an excellent tool for certain jobs. All too often, however, spreadsheets are called upon to perform tasks that are beyond their capabilities. As the old saying goes, “If the only tool you have is a hammer, every problem looks like a nail.” But some problems are better addressed with a screwdriver, with glue, or with a Swiss Army knife.
Python is often called the Swiss Army knife of the programming world because of its versatility and flexibility. That is why it has become increasingly popular over time. Companies can adopt Python to run complex processes over the long term.
During this talk, Ryan will give a firsthand account of Excel Hell and how he managed to escape it using Python. He will also discuss some of the relevant libraries he uses for web scraping, data processing, analysis, and visualization, including Requests, Pandas, Flask, and Airflow, as well as a few strategies he uses when approaching problems with data.
Ryan S. McCoy is a Data Engineer at gotem, LLC, where he is responsible for helping modernize the systems, data infrastructures, and analytics of companies primarily in the Financial Services industry, including Investment Managers, Hedge Funds, Venture Capital funds, and data vendors. Previously he spent a decade at several institutional investment funds located in St. Louis.
GitHub Repo -> https://github.com/ryansmccoy/spreadsheets-to-dataframes
Python has a great library for interacting with Kubernetes (k8s) clusters. This talk will discuss two quick tools to get your feet wet when it comes to interacting with k8s using Python and show you some of the things to look out for, as well as the basics of local vs intra-cluster security.
"the phone calls are coming from inside the house!"
The first service is a simple Flask-based application that will run as a pod inside the cluster, exposing its endpoint through Service and Ingress resources. When you call the "/pod/versions" endpoint, it will return the versions of any applications running in the cluster as JSON. There are some security constraints built into k8s that you should be aware of when trying to access the k8s API internally. We will walk you through how to allow this service to access the API, even with Role-Based Access Control (RBAC) enabled, by using a ServiceAccount. This method grants only this specific service, inside a particular namespace, read-only access to pod information for the cluster.
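As a rough illustration of what such an endpoint might compute, here is a minimal sketch. It assumes that an application's "version" is its container image tag, which the abstract does not state. The helper names `split_image`, `pod_versions`, and `fetch_pods` are hypothetical. The Flask route itself is left out; `fetch_pods` uses the official `kubernetes` client and only works inside a pod whose ServiceAccount is allowed to list pods.

```python
def split_image(image):
    """Split an image reference into (name, version).

    'registry:5000/team/api:1.2.3' -> ('registry:5000/team/api', '1.2.3')
    Assumption: the version is the tag, else the digest, else 'latest'.
    """
    ref, _, digest = image.partition("@")
    head, sep, last = ref.rpartition("/")  # keep a registry port out of the tag
    name, _, tag = last.partition(":")
    return head + sep + name, tag or digest or "latest"


def pod_versions(pods):
    """Map 'namespace/pod/container' to the version of each container image."""
    versions = {}
    for pod in pods:
        for container in pod.spec.containers:
            _, version = split_image(container.image)
            key = f"{pod.metadata.namespace}/{pod.metadata.name}/{container.name}"
            versions[key] = version
    return versions


def fetch_pods():
    """List all pods through the in-cluster API (needs the `kubernetes` package)."""
    from kubernetes import client, config  # imported lazily on purpose

    config.load_incluster_config()  # uses the ServiceAccount token mounted in the pod
    return client.CoreV1Api().list_pod_for_all_namespaces().items
```

A Flask view could then return `pod_versions(fetch_pods())` serialized as JSON.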
The second application will make use of this Flask endpoint and be run from your local command line, using your local k8s config file to get access. We will then use it to compare a secondary application running in a different namespace. This is a smaller version of some real-world tooling we use at Rally Health as we migrate from Mesos to k8s and need to compare state between these two environments, as well as between clusters in different environments. These techniques are just the tip of the iceberg, but they should give you some idea of what the Kubernetes Python client is capable of handling.
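The comparison step can be sketched in a few lines of plain Python. This is only an illustration, not the tooling described in the talk: `diff_versions` is a hypothetical helper, and it assumes each environment has already been reduced to a mapping of application name to version string.

```python
def diff_versions(left, right):
    """Return {app: (left_version, right_version)} for apps whose versions differ.

    An app missing from one side is reported with None on that side.
    """
    diffs = {}
    for app in sorted(set(left) | set(right)):
        left_version, right_version = left.get(app), right.get(app)
        if left_version != right_version:
            diffs[app] = (left_version, right_version)
    return diffs


# Made-up sample data for two environments.
staging = {"api": "1.2.0", "web": "3.1.4"}
production = {"api": "1.3.0", "web": "3.1.4", "worker": "0.9.1"}
drift = diff_versions(staging, production)
```

An empty result means the two environments agree on every application.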
pudb is a feature-rich terminal-based debugger that is a great alternative to Python's built-in debugger (pdb). This demo will show how to launch into the debugger, as well as how to use its remote functionality to connect to and troubleshoot multi-process apps that do not run in the foreground.
Building services is important, but what happens after they are built and running in production? How do we establish trust with our customers that our service will actually be available? Who creates these definitions and how do we measure them? Service Level Indicators (SLI), Agreements (SLA), and Objectives (SLO) are central to an operations mindset and foundational tools for effective Site Reliability Engineering. This talk will take you on a journey through Springfield as we discuss exactly what SLIs, SLAs, and SLOs are, how to measure them, what targets should be measured, how to define uptime, availability, and acceptable error rates, and what happens when they are breached. Attendees will leave with a clear understanding of how to monitor and report for their services, how SLIs, SLAs and SLOs can aid in this process, and how to implement them within their own teams.
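To make the arithmetic behind availability and error rates concrete, here is a small sketch. The function names and the request-based definition of availability are assumptions made for illustration; a real SLI may be defined quite differently (for example, by latency or by time windows).

```python
def availability(good_events, total_events):
    """SLI: fraction of events that were served successfully."""
    if total_events == 0:
        return 1.0  # no traffic, so nothing failed
    return good_events / total_events


def error_budget_remaining(good_events, total_events, slo_target):
    """Fraction of the error budget still unspent (negative once the SLO is breached)."""
    allowed_failures = total_events * (1 - slo_target)
    actual_failures = total_events - good_events
    if allowed_failures == 0:
        return 1.0 if actual_failures == 0 else float("-inf")
    return 1 - actual_failures / allowed_failures


def allowed_downtime_minutes(slo_target, window_days=30):
    """Downtime a time-based SLO tolerates over the window."""
    return window_days * 24 * 60 * (1 - slo_target)
```

With a 99.9% target, 500 failures out of 1,000,000 requests spends about half of the budget, and a 30-day window tolerates roughly 43 minutes of downtime.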
Ray is a framework for distributing and scaling high-performance Python applications across a cluster. It is used in several ML/AI systems and production deployments. This talk explains the problems that Ray solves, including rapid execution of “tasks” and management of distributed state, such as model parameters during training. I’ll use several example applications to illustrate. You'll learn when and how to use Ray in your projects.
Most computer languages offer "int"s and "reals" and maybe some support for "complex" or fixed-point decimal. Python goes further. This talk will discuss the numeric types that ship with Python (such as Fraction and Decimal), numeric types from NumPy, and the Abstract Base Classes that make it possible to add your own specialized numeric type and have it appear as part of the language.
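A short sketch of these ideas using only the standard library follows. `Mod7` is a hypothetical toy type invented for this example, not something from the talk. It subclasses `numbers.Number` so that generic code can recognize it as numeric.

```python
import numbers
from decimal import Decimal
from fractions import Fraction


class Mod7(numbers.Number):
    """Hypothetical example type: integers modulo 7."""

    def __init__(self, value):
        self.value = int(value) % 7

    def __add__(self, other):
        if isinstance(other, Mod7):
            return Mod7(self.value + other.value)
        if isinstance(other, numbers.Integral):
            return Mod7(self.value + other)
        return NotImplemented

    __radd__ = __add__  # lets `4 + Mod7(5)` and sum() work

    def __eq__(self, other):
        return isinstance(other, Mod7) and self.value == other.value

    def __hash__(self):
        return hash(("Mod7", self.value))

    def __repr__(self):
        return f"Mod7({self.value})"


half = Fraction(1, 3) + Fraction(1, 6)            # exact rational arithmetic
thirty_cents = Decimal("0.10") + Decimal("0.20")  # exact decimal arithmetic
wrapped = Mod7(5) + 4                             # wraps around modulo 7
```

Because `Mod7` defines `__radd__`, the built-in `sum()` works on a list of `Mod7` values without any extra glue.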
Have you ever wondered how your computer knows what programs are running? What about what happens behind the scenes when you start a program? This talk will cover the basics of how processes work, and how your operating system keeps track of what's running. By the end of it you will know enough to write your own basic versions of 'ps' or 'top'.
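For a taste of what a basic 'ps' might look like, here is a minimal sketch. It assumes a Linux system, where the kernel exposes each process as a numbered directory under `/proc` with its command name in a `comm` file; other operating systems expose this information differently. The function name `list_processes` is made up for this example.

```python
import os


def list_processes(proc_root="/proc"):
    """Return a sorted list of (pid, command name) pairs found under proc_root."""
    processes = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directory names are process IDs
        try:
            with open(os.path.join(proc_root, entry, "comm")) as handle:
                name = handle.read().strip()
        except OSError:
            continue  # the process exited while we were reading
        processes.append((int(entry), name))
    return sorted(processes)
```

On Linux, looping over `list_processes()` and printing each pair gives a bare-bones process listing.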
If you have attended a few ChiPy events, chances are you have used the chipy.org website. The ChiPy Web Guild is a group of volunteers who help maintain the site. In this talk, I will give a brief description of how the Web Guild works and touch on some aspects of the ChiPy.org site. We will then go through an example of how team members were able to address a flaw in the ChiPy.org code, enhancing the user experience. Finally, I will share some thoughts on what I learned and what the group might work on next.
The rapid growth of Python is due in part to its exceptional toolkit for Data Analysts, Scientists, and Engineers. Packages like Pandas, Scikit-Learn, PySpark, and Dask have become staples for teams looking to process data. However, when processing large amounts of data, there are times when Python might not be the right solution for your task. In this conversation, we'll learn about cloud-based Data Warehouses, such as Google's BigQuery, Amazon's Redshift, and Snowflake. You'll learn about the advantages of these platforms compared to in-memory processing in Python. We'll also show examples of how you can use Apache Airflow to automate recurring tasks, turning your Data Warehouse into the cornerstone of your Data Science infrastructure.
Join us as we describe our migration from a limiting cloud deployment on long-running VMs with shared infrastructure to a streamlined immutable infrastructure built on top of Docker and K8s. We'll also discuss techniques to support local development during this transition.
Many teams wish they could reap the widely known benefits of Kubernetes (K8s), but most struggle to migrate to a new infrastructure while simultaneously supporting two deployment models and avoiding impacts to the velocity of software development. In this talk, we describe the particular challenges we faced during our incremental migration from multiple long-running singleton EC2 instances to a containerized solution. We'll highlight:
- What challenges motivated us to transition to K8s?
- Approaching an infrastructure migration incrementally to minimize impacts to local development and production deployments
- Developing a solution to provide the same abstraction for local development that exists in production
- Concurrently supporting multiple deployment models to reduce risk and simplify migration
- Strategy variations for synchronous and asynchronous services
- Networking challenges with Vagrant and Docker
- Integrating K8s with a CI/CD pipeline
- Tuning the environment