Speaker Series: Computational Sports Informatics
Overview
The sports industry in the US alone is projected to reach $73.5 billion by 2019. With so much revenue at stake, the various stakeholders (including individual teams, the leagues, sports networks, and data collection/curation companies) turn to analytics to gain a competitive advantage in the market. This, in conjunction with advances in computing technology for collecting, storing, and analyzing data, has led to an explosion of sports analytics. This speaker series will include talks that introduce the state of the art in the use of statistics, computational methods, data science, and machine learning in the analysis of sports.
Organizer: Konstantinos Pelechrinis (kpele AT pitt.edu)
Resources:
- D. Oliver, "Basketball on Paper", Potomac Books.
- B. Alamar, "Sports Analytics: A Guide for Coaches, Managers, and Other Decision Makers", Columbia University Press.
- T. Moskowitz, "Scorecasting: The Hidden Influences Behind How Sports Are Played and Games Are Won", Three Rivers Press.
- W.L. Winston, "Mathletics: How Gamblers, Managers, and Sports Enthusiasts Use Mathematics in Baseball, Basketball, and Football", Princeton University Press.
- T. Miller, "Sports Analytics and Data Science: Winning the Game with Methods and Models", FT Press.
- P. O'Donoghue, "Data Analysis in Sport", Routledge Studies in Sports Performance Analysis.
Schedule
- Title: Learning to Play Defense
Abstract: Encoding and/or constructing generative models of collective behavior is a complex task due to the dynamic nature of the couplings between agents. The natural state space of endogenous factors, such as group topology, and exogenous factors, such as the presence and position of other objects, agents, and signals, is far too high dimensional to be experimentally probed within a laboratory setting. New techniques in reinforcement and imitation learning, however, provide an avenue for progress using large quantities of trajectories from observational video data.
In this talk, we will examine these challenges within the context of professional basketball. The goal is to construct unsupervised models which are capable of synthesizing realistic, responsive NBA team defensive behaviors. We will present compelling progress towards this goal and discuss the key technical insights that have made this possible. Specifically, we will focus on issues of data routing, feature selection, network architecture, and multi-model training.
Speaker Bio: Andrew Hartnett is a physicist, ecologist, and educator. His research interests center on extending principles from information theory and machine learning to problems in collective behavior. He received his Ph.D. in 2017 from Princeton University where he studied the mechanisms of coordinated movement and consensus decision-making in animal groups. As a postdoc at Disney Research, Andrew focused on understanding collective behavior in team sports: developing deep recurrent models for encoding and predicting player trajectories in basketball. Currently, he is working on understanding and modeling the complex interactions encountered by autonomous vehicles as an engineer at Argo AI.
Location: IS Building, 3rd Floor Theater
Day and Time: Friday, October 26 - 1:00pm
- Title: The Analytics Kill-Chain
Abstract: Analytic tools are often taught with toy data sets without much connection to real-world purposes. And much of the research in sports analytics is done for its own sake, producing somewhat interesting trivia without providing any insight for a decision-maker. Applied analysis needs to have a focused end result, and all intermediate steps in the process should be connected to and support that result. This is a similar concept to what the U.S. military calls the kill-chain.
Speaker Bio: Brian Burke is a senior analytics specialist for ESPN. Prior to joining ESPN, Brian was a consultant to several NFL teams and a regular contributor to the NY Times, Washington Post, Slate, NBC Sports and other outlets. He developed the core models and tools used throughout football analytics today, and much of his research can be found at his former website AdvancedFootballAnalytics.com. Brian has degrees from the Naval Postgraduate School and George Mason University. He is a graduate of the US Naval Academy and is a former Navy carrier pilot and combat veteran.
Location: IS Building, 4th Floor, 404
Day and Time: Friday, October 27 - 2:30pm
- Title: nflWAR: A Reproducible Method For Offensive Player Evaluation in Football
Abstract: The NFL lacks comprehensive statistics for evaluating player performance. One answer to this need was Total Quarterback Rating (Total QBR; Oliver, 2011). However, Total QBR is built on proprietary data, is not defined on a scale convertible to wins, and is only available for the QB position. We introduce our reproducible method for calculating Wins Above Replacement (WAR) for offensive skill positions in football, nflWAR, based on publicly available NFL play-by-play data from 2009-2016 accessible with nflscrapR. First, using our novel multinomial logistic regression expected points model, we estimate the “true” value for each play with expected points added (EPA; Burke et al., 2015). Then, similar to work measuring pitcher and catcher value in baseball (Judge et al., 2015), we extend a multilevel model approach to isolate the EPA contribution made by individual offensive players and teams as random effects while accounting for variables relating to the game situation as fixed effects. Next, we establish a robust way to define “replacement” level for each offensive skill given the historical play-by-play data. Finally, the expected points above replacement is converted to WAR (which we provide for all skill position players dating back to 2009) based on the observed relationship between points scored and wins. We emphasize that our reproducible nflWAR framework can be extended to estimate WAR for non-skill position players (e.g. linemen, linebackers, etc.) if provided with data specifying which players are on the field for both teams every play.
Speaker Bio: Ronald Yurko is a first-year statistics Ph.D. student at Carnegie Mellon University. He received his B.S. in statistics at Carnegie Mellon in December 2015. At CMU he co-founded the Carnegie Mellon Sports Analytics Club with Maksim Horowitz. Ron previously worked as a baseball operations data and analytics intern for the Pittsburgh Pirates during the 2014 season, as well as a quantitative analyst in the financial services industry.
Location: IS Building, 3rd Floor
Day and Time: Thursday, October 12 - 2pm
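For readers unfamiliar with the quantities named in the nflWAR abstract, the following is a minimal Python sketch of two of its steps: computing expected points added (EPA) from next-score outcome probabilities, and converting points above replacement into wins. It is an illustration under stated assumptions, not the authors' implementation. The outcome set, the point values, the example probabilities, and the `points_per_win` figure are all hypothetical placeholders; in the actual method the probabilities come from a fitted multinomial logistic regression model and the points-to-wins relationship is estimated from observed data.

```python
# Hypothetical point values for each "next score" outcome. The outcome set and
# the values are illustrative assumptions, not taken from the nflWAR work.
POINT_VALUES = {
    "touchdown": 7.0,
    "field_goal": 3.0,
    "safety": 2.0,
    "no_score": 0.0,
    "opp_safety": -2.0,
    "opp_field_goal": -3.0,
    "opp_touchdown": -7.0,
}


def expected_points(probs):
    """Return the probability-weighted sum of outcome point values.

    `probs` maps outcome names to probabilities, as a multinomial model
    would produce for one game situation. Omitted outcomes count as 0.
    """
    if abs(sum(probs.values()) - 1.0) > 1e-6:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(POINT_VALUES[outcome] * p for outcome, p in probs.items())


def expected_points_added(probs_before, probs_after):
    """EPA for a play: expected points after the play minus before it."""
    return expected_points(probs_after) - expected_points(probs_before)


def wins_above_replacement(points_above_replacement, points_per_win):
    """Convert points above replacement to wins with a fixed ratio.

    `points_per_win` is a hypothetical constant standing in for the
    estimated relationship between points scored and wins.
    """
    if points_per_win <= 0:
        raise ValueError("points_per_win must be positive")
    return points_above_replacement / points_per_win


# Made-up probabilities for the situation before and after a single play.
before_play = {"touchdown": 0.25, "field_goal": 0.20, "no_score": 0.10,
               "opp_field_goal": 0.20, "opp_touchdown": 0.25}
after_play = {"touchdown": 0.40, "field_goal": 0.25, "no_score": 0.05,
              "opp_field_goal": 0.15, "opp_touchdown": 0.15}

print("EPA for the play:", round(expected_points_added(before_play, after_play), 2))
print("WAR:", wins_above_replacement(60.0, points_per_win=30.0))
```

The sketch leaves out the multilevel model that attributes EPA to individual players and teams, as well as the definition of replacement level. Both depend on the play-by-play data and on modeling choices that the abstract does not spell out.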
- Title: Sports Analytics: Past, Present, and Future
Abstract: In this talk we will discuss the past, present, and future applications of sports analytics to baseball, football, and basketball. Promising areas for future research will be discussed as well as the current applications of analytics to these sports.
Speaker Bio: Wayne Winston is a Visiting Professor at the Bauer College of Business at the University of Houston and Professor Emeritus of Decision Sciences at the Kelley School of Business at Indiana University. He holds a BS in mathematics from M.I.T. and a Ph.D. in operations research from Yale. He won over 40 teaching awards at Indiana University, including the Top MBA teaching award (6 times), and an EMBA teaching award at the Bauer College of Business at the University of Houston. He has written over a dozen books including Marketing Analytics, Data Analysis and Decision Making, Operations Research, Practical Management Science, Excel 2016 Data Analysis and Business Modeling, and Mathletics. He has also authored 25 refereed articles. He has taught classes or consulted for many organizations, including Abbott, SABRE, BROADCOM, Cummins Engine, Eli Lilly, James Hardy, MGM, Pfizer, Verizon, Microsoft, Cisco, US Navy, US Army, Ford, 3M, and GM. He is also a two-time Jeopardy! Champion and has consulted for the NBA’s Dallas Mavericks and New York Knicks.
Location: IS Building, 3rd Floor Theater (135 North Bellefield Ave)
Day and Time: Friday, April 21 - 10am
- Title: Modeling Sequential Decision Making in Team Sports using Imitation Learning
Abstract: Current state-of-the-art sports statistics compare players and teams to league average performance, such as "Expected Point Value" (EPV) in basketball. These measures have enhanced our ability to analyze, compare and value performance in sport. But they are inherently limited because they are tied to a discrete outcome of a specific event. For example, EPV for basketball focuses on estimating the probability of a player making a shot based on the current situation. In this work, we explore how teams control time and space by examining sequential decision making.
We have developed an automatic "ghosting" system which illustrates where defensive players should have been (instead of where they actually were) based on the locations of the opposition players and ball. We employ a machine learning method called deep imitation learning, and modify standard recurrent neural network training to consider both instantaneous and future losses, which enables ghosted players to anticipate movements of their teammates and the opposition. Our approach avoids the man-years of manual annotation needed to train existing ghosting systems, and can be fine-tuned to mimic the behavior of specific teams or playing styles.
Speaker Bio: Peter Carr is a Research Scientist at Disney Research, Pittsburgh. His research interests lie at the intersection of computer vision, machine learning and robotics. Peter joined Disney Research in 2010 as a Postdoctoral Researcher. Prior to Disney, Peter received his PhD from the Australian National University in 2010, under the supervision of Prof. Richard Hartley. Peter received a Master's Degree in Physics from the Centre for Vision Research at York University in Toronto, Canada, and a Bachelor's of Applied Science (Engineering Physics) from Queen's University in Kingston, Canada.
Location: IS Building, 3rd Floor Theater (135 North Bellefield Ave)
Day and Time: Friday, March 31 - 1pm
- Title: The Internet of Swings
Abstract: As a connected device company, Diamond Kinetics is collecting sensor data from the swings of players ranging from 8 years old to professional athletes. Baseball, as a sport, is filled with fantastic statistical analysis of on-field results. The SwingTracker sensor gives the kinematic "cause" to the on-field statistical "effect". Armed with this data, we'll discuss how Diamond Kinetics approaches visualizing its data for comprehension and action as well as how Diamond Kinetics is approaching machine learning. Connected devices are an inevitable part of the future and the statistics-driven world of baseball will likely find a way to unlock the potential of such a rich data set.
Speaker Bio: Mike is a native of East Brunswick, New Jersey, where his high school volleyball team won two state titles (Go Bears!). Carnegie Mellon University brought Mike to Pittsburgh, where he studied Computer Science and graduated in a record 2.5 years. Combining his passion for sports and computer science, Mike started StatEasy in 2010, a statistics and video company based in Pittsburgh, Pennsylvania. Most recently, Mike is the Director of Engineering for Diamond Kinetics, Inc., a connected device company focusing on the movement data of athletes in baseball. Mike's hobbies, aside from playing and coaching volleyball, include board games, biking, scuba diving, and quadcopter (drone) racing.
Location: IS Building, 3rd Floor Theater (135 North Bellefield Ave)
Day and Time: Friday, February 17 - 1pm
- Title: Best Practices for Analytics in Sports Industry
Abstract: The professional sports world is becoming more strategic about driving revenue across the departments of an organization through business intelligence. In addition, business analytics is becoming a key driver in building teams on the court, floor and playing field. We will highlight how information technology and analytics are being adopted by various sectors of the sports industry through best practices, challenges, and opportunities. An example would be how business analytics affects season ticket pricing for a new D-League franchise. Equally important are the additional challenges we and others in the industry face, and the skill set needed to solve them.
Speaker Bio: Steve Swetoha is the current President of the Greensboro Swarm in North Carolina. Previously, he spent six years as president, general manager and chief revenue officer of the Tulsa Shock in the WNBA. Swetoha has experience in the NFL, NBA, NHL, WNBA and the ACC. A native of the Pittsburgh area, Swetoha earned his bachelor’s degree from Robert Morris University. In 2010, he was elected to their Sports Management Hall of Fame. He holds a master’s degree in sports leadership from Duquesne University.
Location: IS Building, 3rd Floor Theater (135 North Bellefield Ave)
Day and Time: Friday, February 10 - 9:30am
- Title: Scheduling Major League Baseball
Abstract: Since 2005, I (as part of a small firm, the Sports Scheduling Group) have been involved with putting together Major League Baseball's team and umpire schedules. I will talk about what goes into such schedules and the role that optimization and "big data" play in creating them.
Speaker Bio: Michael Trick is the James H. and Harry B. Professor of Operations Research and Senior Associate Dean, Faculty and Research at the Tepper School of Business, an institution he joined in 1989. He is a researcher in computational integer programming, with interests in sports scheduling and computational social choice. He was President of the Institute for Operations Research and the Management Sciences (INFORMS) in 2002 and will be President of the International Federation of Operational Research Societies (IFORS) in 2016-2019. He has consulted for Major League Baseball and many college conferences on scheduling issues and with the FCC, IRS, and United States Post Office on optimization approaches. He is a Fellow of INFORMS.
Location: IS Building, Room 501 (135 North Bellefield Ave)
Day and Time: Friday, January 20 - 1pm
- Title: Winning in Sports with Statistics
Abstract: The path to becoming successful in the sporting world is substantially different from that of more traditional fields. Whether you're an athlete or an analyst, the best way to be successful working in sports is to effectively demonstrate your ability to have a positive impact on a team or organization. In this talk, I will discuss the skills that can be easily acquired in an academic program that are extremely important to having success as a data scientist in the sporting world, such as programming, data visualization, and professional communication (writing and speaking). This will be framed in my personal experiences from working with professional sports teams, doing research on statistics in sports, reviewing papers on sports analytics, coaching ice hockey, and teaching students about statistics. Although my own work is primarily in hockey, I will also discuss the use of statistics and data science in football, baseball, and basketball.
Speaker Bio: Sam Ventura is a Visiting Assistant Professor of Statistics at Carnegie Mellon, with research modeling the spread of infectious diseases, developing new tools for record linkage, and developing new statistical methodology for supervised and unsupervised learning. In addition to his academic appointments, Sam is a member of the statistical advisory board for the Houston Astros and is currently a statistical consultant for the 2016 Stanley Cup Champion Pittsburgh Penguins. Sam has published papers on sports analytics, most notably in the Annals of Applied Statistics, and has presented his work at numerous statistics and sports analytics conferences. He is an associate editor for the Journal of Quantitative Analysis in Sports and a reviewer for the Journal of Sports Analytics. In 2014, along with Andrew Thomas, he created war-on-ice, the most popular online resource for modern hockey statistics. Sam is a Pittsburgh native and earned his PhD in Statistics from Carnegie Mellon in 2015.
Location: IS Building, 3rd Floor Theatre (135 North Bellefield Ave)
Day and Time: Friday, January 13 - 1pm
This page was last updated on 12/11/2018 04:02:02.