We are sports fans, and baseball statistics are extensive and challenging to tackle
There are an incredible number of factors to consider for each player and game
Analyze baseball data to determine contributing factors to team wins
Discover correlations between player performance and team wins
Use analysis for FanDuel, fantasy league baseball
All Star Data: Sean Lahman
Houses a database that contains complete batting and pitching statistics from 1871 to 2018 as well as fielding statistics. Size: 3 MB
Spring Training and Regular Season Standings: MLB.com
Has a collection of data detailing Spring Training and Regular Season standings from 1901 to 2019. Size: 1 MB
Pitching Statistics: Fangraphs
Contains individual and overall pitching statistics. Size: 2 MB
World Baseball Classic data: WBC
Contains WBC data over the years. Size: 2 MB
Can we get a glimpse of a team's Regular Season performance from its Spring Training stats?
Which statistic is most significant to a team's success?
How does the World Baseball Classic tournament impact players' performance?
Does the best performer always get picked for All Star?
Correlation analysis between certain traits, such as batting average, and wins
Linear regression analysis to verify the correlations and support future Regular Season predictions
Calculated batter and pitcher performance for players who participated in the WBC, then compared each player's performance in the WBC year with the year before and the year after the tournament. This was to see whether a player's participation has any effect on their Regular Season performance.
Set up customized ranking for players
Use the Apriori algorithm to find attributes that frequently occur together with wins
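The correlation and linear regression steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not the project's actual code: the function names pearson_r and fit_line are hypothetical, run difference is used only as an example trait, and the five data points are made up for demonstration rather than taken from any of the data sources.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustrative numbers (NOT real MLB data): one entry per team season.
run_diff = [-100, -50, 0, 50, 100]
wins = [66, 74, 81, 89, 95]

# Step 1: how strongly does the trait move with wins?
r = pearson_r(run_diff, wins)

# Step 2: fit a line so the trait can be used to predict wins.
slope, intercept = fit_line(run_diff, wins)
predicted = intercept + slope * 75  # predicted wins for a +75 run difference

print(f"r = {r:.3f}, wins ~ {intercept:.1f} + {slope:.3f} * run_diff")
print(f"predicted wins at +75: {predicted:.1f}")
```

The same two functions could be applied to any other trait column, such as batting average, to compare which statistic tracks wins most closely.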
Spring Training does not have a big impact on Regular Season outcomes
Certain traits, such as run difference, are associated with more wins
WBC participation has a surprisingly positive impact on player performance
All Star selection is not perfectly correlated with individual performance