Chinese Culture University Institutional Repository (CCUR): Item 987654321/25346


    Please use this identifier to cite or link to this item: https://irlib.pccu.edu.tw/handle/987654321/25346


    Title: 運用資料探勘技術於職棒比賽勝負預測之研究-以美國職棒大聯盟為例
    Studies on Predicting the Outcome of Professional Baseball Games with Data Mining Techniques: MLB as a Case
    Authors: 馮瑞祥
    Fong, Ruei-Shiang
    Contributors: Department of Information Management
    Keywords: Artificial Neural Network
    Data Mining Techniques
    Game Prediction
    Data Analysis
    Date: 2013-06
    Issue Date: 2013-09-30 11:28:09 (UTC+8)
    Abstract: Professional baseball places heavy emphasis on collecting and analyzing data, so every game produces a large volume of statistics available for analysis. Data mining is a computer-based technique for extracting key findings from large amounts of data. Applied to professional baseball data, it can work efficiently and avoid the errors that manual analysis introduces. This study uses data mining techniques to predict the winners and scores of Major League Baseball (MLB) games.

    The data cover every regular-season game played by the 30 MLB teams from 2000 to 2012. The input variables are the averages of each team's fielder and pitcher statistics over its previous ten games. First, Pearson product-moment correlation analysis was used to remove variables that were only weakly related to the game outcome and variables that exhibited multicollinearity, leaving a suitable set of inputs. Then a back-propagation network (BPN), a type of artificial neural network, was used to build a model from the selected variables. The first 100 games served as the training set and the remaining 62 games as the validation set. The predicted scores were then compared with the actual game results and with the betting lines.
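The preprocessing steps described above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function names, the greedy rule for dropping collinear variables, and both correlation thresholds are assumptions, and the neural network itself is not shown.

```python
from math import sqrt


def prior_window_means(values, window=10):
    """Mean of the `window` games before each game; None until enough history exists."""
    means = []
    for i in range(len(values)):
        if i < window:
            means.append(None)
        else:
            means.append(sum(values[i - window:i]) / window)
    return means


def pearson(x, y):
    """Pearson product-moment correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def select_features(features, target, min_target_r=0.1, max_pair_r=0.9):
    """Keep features correlated with the target, then drop near-duplicates.

    Both thresholds are illustrative assumptions, not values from the thesis.
    """
    ranked = sorted(
        ((abs(pearson(col, target)), name) for name, col in features.items()),
        reverse=True,
    )
    kept = []
    for strength, name in ranked:
        if strength < min_target_r:
            continue  # weakly related to the outcome
        if any(abs(pearson(features[name], features[k])) > max_pair_r for k in kept):
            continue  # collinear with a stronger feature that is already kept
        kept.append(name)
    return kept


def split_season(games, n_train=100):
    """First n_train games for training, the rest for validation."""
    return games[:n_train], games[n_train:]
```

With a 162-game list, `split_season` yields the 100/62 partition mentioned in the abstract. Averaging only the games before each game keeps the result of the game being predicted out of its own inputs.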

    The model's predicted home and away scores were compared with the over/under, winning-margin, and run-line (handicap) betting lines of the sports lottery. The results indicate that the proposed model achieves better prediction accuracy. Future researchers may change the input variables fed to the model, which may further improve prediction accuracy.
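One way to grade predicted scores against betting lines is sketched below. The abstract does not describe the thesis's grading procedure, so the dictionary keys, the sign convention for the handicap (negative when the home team is favoured), and the rule of skipping exact ties are all assumptions made for illustration.

```python
def over_under_pick(pred_home, pred_away, total_line):
    """Pick 'over' when the predicted combined score exceeds the posted total, else 'under'."""
    return "over" if pred_home + pred_away > total_line else "under"


def run_line_pick(pred_home, pred_away, home_handicap):
    """Pick the side that covers once the handicap is added to the home team's predicted score.

    home_handicap is e.g. -1.5 when the home team is favoured (an assumed convention).
    """
    return "home" if pred_home + home_handicap > pred_away else "away"


def hit_rate(games):
    """Fraction of graded picks that match the actual result; exact ties with a line are skipped."""
    hits = graded = 0
    for g in games:
        # Grade the over/under pick against the actual combined score.
        actual_total = g["home"] + g["away"]
        if actual_total != g["total_line"]:
            graded += 1
            actual = "over" if actual_total > g["total_line"] else "under"
            hits += over_under_pick(g["pred_home"], g["pred_away"], g["total_line"]) == actual
        # Grade the run-line pick against the handicap-adjusted actual score.
        adjusted = g["home"] + g["handicap"]
        if adjusted != g["away"]:
            graded += 1
            actual = "home" if adjusted > g["away"] else "away"
            hits += run_line_pick(g["pred_home"], g["pred_away"], g["handicap"]) == actual
    return hits / graded if graded else 0.0
```

Each game is a dictionary holding the predicted scores (`pred_home`, `pred_away`), the actual scores (`home`, `away`), and the two lines (`total_line`, `handicap`). Working from predicted scores rather than a win/loss label lets a single model output be graded against several bet types.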
    Appears in Collections: [Department of Information Management & Graduate Institute of Information Management] Thesis

    Files in This Item:

    File                 Description  Size    Format     View/Open
    fb130930112403.pdf                1951Kb  Adobe PDF  6231
    index.html                        0Kb     HTML       440


    All items in CCUR are protected by copyright, with all rights reserved.

