Data Science Podcasts


The Talking Machines

Hosts: Ryan Adams, Katherine Gorman
Episodes: 42

Probably my all-time favourite. If I had to recommend a single show, this would be it. Katherine Gorman and Ryan Adams discuss topics at the cutting edge of machine learning. The show feels pitched at a slightly more technical audience than some of the others below, but anyone working in fields related to statistics or ML will enjoy this. Expect to hear about neural nets, MCMC, Gaussian processes, autoencoders, restricted Boltzmann machines, densities, likelihoods, loss functions - this is not a show that treats ML algorithms as software black boxes, and a lot of the gory details are covered. The format is great, and the discussion is often motivated by a hot idea in the community or by recent machine learning research. The clarity with which Ryan is able to boil down highly complex ideas is one of the best parts - you have to hear it, but it is seriously well done. This is probably not a podcast to listen to while doing the shopping - I needed to pay close attention to begin with, but the reward is that you can learn a lot from this show. The show usually features guest interviews, many of them recorded at recent conferences, and the list of guests over the first two seasons features many of the top people working in ML. Each interview explores the complex trajectory the guest took to reach their position, before discussing their interests in machine learning. Listener questions feature towards the end, and here the listener is exposed to more practical aspects of ML, including strategies to train models or select model architecture. The podcast is medium length, with episodes around 40 minutes, which I find is perfect. It feels carefully edited and there is very little superfluous dialogue, even during interviews, so it’s a very efficient listen. Audio quality is fantastic and there is no advertising on the show. The show is just beginning its third season, with Neil Lawrence replacing Ryan Adams - give it a listen!

Not So Standard Deviations

Hosts: Roger Peng, Hilary Parker
Episodes: 38

As the name suggests, this podcast has its roots in statistics, but the pitch feels just right for the data science community. As a statistician, I really like the perspective that is brought by the show, which is somewhat different to the more computer science-y view that is sometimes prevalent. The show’s content cuts across different aspects of data analysis, but expect to hear commentary on experimentation, visualisation, uncertainty and model checking that will resonate with scientists of a statistical persuasion. NSSD has a nice informal feel and follows a very relaxed conversation between Roger Peng (a Johns Hopkins professor in biostatistics) and Hilary Parker (Hopkins biostatistics PhD grad and data scientist at Stitch Fix). Conversation covers broad topics related to software tools, data analysis and modelling issues that arise in academia and industry. Shows sometimes feature guests from the data science or statistics communities, and these are often on the side of application and tools. The show doesn’t explicitly emphasise software, but with its statistical roots, R makes regular appearances, particularly the tidyverse range of R packages, and I find it a good way to keep up-to-date with developments there. The sound quality is mostly excellent. The presenters converse remotely and the sound occasionally suffers a bit, although this has improved since the early episodes. The podcast has a Patreon page where you can support further episodes. The episodes average about an hour in length.

Partially Derivative

Hosts: Jonathon Morgan, Vidya Spandana, Chris Albon
Episodes: 89

The show is hosted by Jonathon, Vidya and Chris (sometimes a subset of the three), each a highly accomplished data scientist. During the typical episode, listeners join the hosts while they chat over a beer and discuss elements of data science. The hosts are very dynamic and engaging, with really clear enthusiasm for data analysis and particularly for learning. Of all the shows I listen to, this is the one that I finish feeling inspired to try new things and to read more widely. The show feels pitched at quantitative listeners who are interested in developing their skills in data science-related areas. I like that the show brings a very broad perspective on the day-to-day practice of data analysis, and takes a very pragmatic view on the role of domain knowledge, software, and data and processing architecture, which are key to the practice of data analysis but often overlooked. The show feels like a gathering of the presenters sharing stories from their work, and is funny and very relaxed. During some episodes some of the presenters have spoken individually on more specific topics, for example the value of getting a PhD, and continuous learning. The show is very rich in information and ideas, and is well-supported by show notes on the website. The episodes are usually 30 - 40 mins and sound quality is great. The show is unusual in having attracted sponsorship, and though there is usually an ad at the beginning, it is short and relevant.

Linear Digressions

Hosts: Katie Malone, Ben Jaffe
Episodes: Quite a few.

Pitched squarely at data scientists, this show does a really good job of introducing the principles of machine learning techniques and new data analysis. The technical level is balanced really well: technical jargon is used sparingly, but the show also doesn’t shy away from discussing the guts of an algorithm when it needs to. The variety is particularly good - some episodes cover topics in modelling that are directly relevant to data scientists, while others take a high-level view of how modelling is integrated into broader frameworks and architectures. As an example of the latter, a recent episode on the use of federated learning to combine machine learning performed on mobile devices with a cloud-based model was fascinating, and this is probably the only show where I’d have learnt about it. The show is really well prepared, and the conversation between Katie and Ben is natural and fun to listen to. One of the things I really like about the show is that it is really bite-sized - usually around 20 mins - and the hosts do a really great job of summarising a topic in this time and still finding time to discuss it.

Other notable shows

  • Adversarial Learning. A new(ish) podcast with broad conversations on data science featuring guests from tech and industry.
  • Becoming a Data Scientist. Hosted by Renee Teate, with lots of interesting guest interviews, and an emphasis on those beginning in data science and looking to develop their technical skills.
  • Data Science at Home. I only just discovered this, but the list of topics on statistical ML looks great.
  • O’Reilly Data Show
  • Programming Throwdown. Not data science, but coverage of topics from the broader world of technology and programming, software development, information security.
  • Recode Decode. Kara Swisher interviews tech founders. Ok, this isn’t data science at all, but since a lot of new ML development is driven by activity in Silicon Valley tech companies, it is worth keeping half an eye on high-level developments. Automation and AI appear a lot.
  • R Weekly podcast. A short news digest from the world of R programming. Delivered by a machine, a little weird.
  • The R Podcast. A new(ish) show covering news, developments and examples from the R community.
  • The Stack Overflow podcast. A more general programming podcast. I have only just discovered this, but will report back after a few listens!