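At ArchSummit Beijing 2019, speaker Jose David Baena gave a talk titled "The Pain Points of Adopting Machine Learning at Scale at GitHub and How to Solve Them". The main content is as follows.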

Talk abstract:

Title: Adopting Machine Learning at Scale

Scaling up machine learning (ML), data retrieval and reasoning algorithms from Artificial Intelligence (AI) for massive datasets is a major technical challenge of our time. The scaling process also has several dimensions: performance, development productivity, number of employees…

In this talk I will show how we used to develop machine learning features at GitHub, the pain points we had, and how we changed our infrastructure and development practices so that we can productionize multiple ML features in a matter of hours or days.

In addition, I will explore with the audience the main factors I consider when scaling ML at medium to large companies.

By the end of the talk you should have an overview and an applicable framework for scaling ML processes in your company.

Talk outline:

Potential outline for the talk:

Introduction to ML at GitHub.
Challenges of running ML at scale. Different dimensions:
- Performance: number of requests
- Development: growing infrastructure, number of ML features
- Organizational: number of employees
ML ecosystem architecture.
Improving agility and development on ML features.
Adopting ML at scale in your company.

Speaker bio:

Jose David Baena, Senior Software Engineer, GitHub.

Jose David Baena is a Senior Software Engineer at GitHub. He has more than 10 years of experience in backend development, from startups to big companies, from Europe to the United States.

His experience ranges from building distributed low-latency systems for financial companies to high-performance crawlers for social media.

At the moment, he designs architectures that are used by the Machine Learning and Data Science teams at GitHub. He is passionate about distributed systems, machine learning scalability and developer productivity.

Download link for the full talk slides (PPT):

https://archsummit.infoq.cn/2019/beijing/schedule
