April 6, 2023

Sometimes I get questions about what tools I used to do some piece of work — in other words, about my “personal tech stack”. Rather than continuing to answer these questions as one-offs, I decided it would make sense to write it down and share it in one place.

Check it out here:

Feel free to send me any questions or comments on Mastodon.

Breaking Into Data Science with PyData Pittsburgh

September 15, 2022 Talks PyData Pittsburgh

I was pleased to serve as a panelist and moderator for PyData Pittsburgh’s latest panel discussion, Breaking Into Data Science. The event’s primary goal was to share advice about what it takes to get a job in data science or machine learning, especially for recent graduates or mid-career professionals looking to transition into the field.

I was joined by a group of experienced data science managers and leaders from across tech, healthcare, and academia, including:

Elizabeth Milkovits, data science manager at Amazon
Marie Skoczylas, data scientist and analytics leader at Highmark Health
Peter Casey, program director for Data Science for Social Good at Carnegie Mellon University

We talked about our perspectives having served on both sides of job interviews for data science and machine learning roles, what we look for in new hires, and the most effective things aspiring data scientists and machine learning engineers can do to stand out to hiring managers. We were having such a good conversation, we ran 30 minutes over the originally-scheduled time!

Special thanks to tech incubator OneValley at the Roundhouse for hosting us in their fantastic event space.

This post was originally published on datatheoretic.com.

Jupyter in Production at PyData Pittsburgh

August 18, 2022 Talks PyData Pittsburgh

I was pleased to present a version of my talk Jupyter in Production for a recent gathering of the PyData Pittsburgh meetup group. We had a great turnout, and I enjoyed the opportunity to engage with attendees in an extended and thoughtful question-and-answer session after the talk.

You can find the slides for my talk below, as well as links to all the tools, references, and resources I highlight.

Linked References and Resources

This post was originally published on datatheoretic.com.

Re-Launching PyData Pittsburgh

June 23, 2022 PyData Pittsburgh

I’m happy to share that I’ve joined the leadership team of the PyData Pittsburgh group. With over 900 members on Meetup.com, PyData Pittsburgh serves as the primary Python meetup group for the Pittsburgh region, and as a hub for the local data science and data engineering communities. As part of re-launching the group, we’ve formally affiliated with NumFOCUS, the premier organization promoting open source software for scientific computing.

I’m pleased to partner with Pete Fein in this effort, as well as the rest of PyData Pittsburgh’s core members. If you’re interested in data engineering in Python, be sure to check out the work Pete is doing over at Snakedev.

Engaging with local and global technology communities has always been a central part of my identity as a data scientist and leader, going back to my time as a leader for the Charlottesville Data Science Group and a founding organizer of the Applied Machine Learning Conference. I’m delighted to get back to work building communities for data scientists and data engineers!

If you’re interested in learning more about PyData Pittsburgh, please check us out on Meetup, LinkedIn, or reach out to me directly.

This post was originally published on datatheoretic.com.

Jupyter in Production at Rev 3

May 5, 2022 Talks

I was pleased to be invited to present the talk Jupyter in Production at the Rev 3 MLOps Conference in New York City. Jupyter Notebooks have become an essential part of the data scientist’s toolkit, but deploying work developed in notebooks into production settings can be painful. Over the last few years, I’ve been excited to see a new ecosystem of tools emerge that push the limits of what you can do with Jupyter Notebooks in production environments.

In this talk, I highlight how some of these tools can help you to develop and distribute software libraries, build and run data pipelines, and create and serve interactive reports and dashboards — all without leaving Jupyter Notebooks. You can find the slides for my talk below.

Linked References and Resources

This post was originally published on datatheoretic.com.

Data Science Leadership Exchange with Data Science Central

August 25, 2020 Talks

I was pleased to serve as a panelist for the latest Data Science Central panel discussion and webinar, Data Science Leadership Exchange: Best Practices for Driving Outcomes. I joined Brian Loyal, Cloud Analytics Lead at Bayer Crop Science, Matt Cornett, Director of Data Science at Transamerica, and host Bill Vorhies. We talked about the pros and cons of different organizational structures for data science teams and ideas for optimizing each stage of the data science project lifecycle, from the initial project idea to managing machine learning models in production.

Check out the video below to go deeper!

This post was originally published on datatheoretic.com.

Sharing my Personal Tech Stack

Breaking Into Data Science with PyData Pittsburgh

Jupyter in Production at PyData Pittsburgh

Linked References and Resources

Re-Launching PyData Pittsburgh

Jupyter in Production at Rev 3

Linked References and Resources

Data Science Leadership Exchange with Data Science Central