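This study develops a machine-learning-based calibration procedure, together with an instrument error correction experiment, for Storm Tracker (ST), a novel mini-radiosonde developed by the Department of Atmospheric Sciences at National Taiwan University. To address the temperature and humidity errors that arise when solar radiation heats the sensor chip during soundings, we adopted the Generalized Linear Model (GLM) and the Gradient Boosting Model (GBM) as the core models. The training data consist of 931 co-launches of ST with Vaisala-RS41 (VS), the operational radiosonde of the Central Weather Bureau, collected during observation experiments in Taiwan between 2018 and 2022. On this basis we developed a two-stage calibration core that corrects temperature first and water vapor second. Following the sounding data correction steps recommended by Ciesielski et al. (2012), we built a modular ST data calibration workflow that provides accurate, stable, and efficient calibration results.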
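We also designed an instrument error experiment suited to the characteristics of ST. In parallel intercomparison experiments, we computed the error between ST and a WXT surface automatic weather station by linear regression and applied a linear shift correction to bridge the two datasets. The calibrated ST units then served as reference units for batch synchronous observations of other ST units inside a box with a stable environment. In total, this study accumulated data from 270 instrument error experiments, and a Kolmogorov-Smirnov (K-S) test indicated that the ST instrument error is normally distributed. After the statistical linear correction, more than 70% of the ST observation errors were smaller than the measurement accuracy of the instrument itself, and the distribution characteristics of the ST observations were well preserved. This result supports the subsequent application of the machine learning calibration and demonstrates the feasibility of real-time ground calibration in future operations.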
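Finally, we compared the stability and calibration capability of the machine learning models. Taylor diagrams show that the linear calibration model (GLM) calibrates well at low levels (1000-700 hPa), but above 700 hPa its correlation and data distribution drift progressively away from the reference with height. The nonlinear model (GBM), which has a more complex structure, maintains very good calibration performance statistically; however, because it achieves this through a nonlinear algorithm, the mechanisms and physical meaning of its corrections are harder to interpret.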
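As an illustration of the statistics a Taylor diagram summarizes, the following minimal sketch computes the correlation, the standard deviation ratio, and the centered RMS difference between a reference profile and a candidate profile. The function name `taylor_statistics` and its return keys are hypothetical; this is not the code used in the study.

```python
import math

def taylor_statistics(reference, candidate):
    """Return the three quantities plotted on a Taylor diagram.

    `reference` and `candidate` are equal-length sequences of numbers,
    e.g. VS and calibrated ST temperatures on the same pressure levels.
    (Hypothetical helper for illustration only.)
    """
    n = len(reference)
    if n != len(candidate) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")

    mean_ref = sum(reference) / n
    mean_cand = sum(candidate) / n
    # Anomalies: the Taylor diagram ignores the overall mean bias.
    ref_anom = [r - mean_ref for r in reference]
    cand_anom = [c - mean_cand for c in candidate]

    std_ref = math.sqrt(sum(a * a for a in ref_anom) / n)
    std_cand = math.sqrt(sum(a * a for a in cand_anom) / n)
    if std_ref == 0.0 or std_cand == 0.0:
        raise ValueError("both series must vary")

    covariance = sum(a * b for a, b in zip(ref_anom, cand_anom)) / n
    centered_rmse = math.sqrt(
        sum((b - a) ** 2 for a, b in zip(ref_anom, cand_anom)) / n
    )
    return {
        "correlation": covariance / (std_ref * std_cand),
        "std_ratio": std_cand / std_ref,
        "centered_rmse": centered_rmse,
    }
```

The three values are linked by the law of cosines (centered RMSE squared equals the sum of the two variances minus twice the product of the standard deviations and the correlation), which is what allows them to share one diagram. Because the mean is removed, a constant warm bias does not appear in these statistics and has to be assessed separately.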
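In actual field experiments, the temperature profiles in the ST sounding time series clearly show a warm bias in the ST temperature measurements, which directly leads to an overestimate of the computed water vapor. The machine learning calibration not only corrects these warm and moist biases but also brings out more detail in the vertical structure of the atmosphere. At the same time, we noted that the GBM-calibrated data contain more pronounced jitter (high-frequency signals) because the correction relies on an ensemble of decision trees. Data processed with the calibration workflow of this study show an effectively reduced warm bias and water vapor overestimate caused by solar radiative heating in boundary-layer soundings. This correction substantially improves the credibility of ST data and should significantly benefit studies of boundary-layer thermodynamics and near-surface water vapor transport.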
This study develops a machine-learning-based calibration procedure and an instrument error correction experiment for Storm Tracker (ST), a novel mini-radiosonde. To address the temperature and humidity errors caused by solar radiation heating the sensor chip during soundings, we adopted the Generalized Linear Model (GLM) and the Gradient Boosting Model (GBM) as the core models. The training data consist of 931 co-launches of ST and Vaisala-RS41 (VS), the operational radiosonde of the Central Weather Bureau, conducted between 2018 and 2022. From these data we developed a two-stage calibration approach that corrects temperature first and humidity second. Following Ciesielski et al. (2012), we constructed a modular ST data calibration workflow that provides accurate, stable, and efficient calibration results.
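The two-stage idea (temperature first, then humidity using the corrected temperature) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: ordinary least squares (equivalent to a Gaussian GLM with identity link) stands in for the GLM, the GBM is not shown, and the predictor set (raw ST temperature, raw ST humidity, and a solar elevation angle) as well as all function names are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def fit_linear(features, target):
    """Least-squares fit with intercept; returns [intercept, coef_1, ...]."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def predict(coef, features):
    """Evaluate the fitted linear model on rows of features."""
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], f)) for f in features]

def fit_two_stage(st_temp, st_hum, solar_elev, ref_temp, ref_hum):
    """Stage 1 fits temperature; stage 2 fits humidity on corrected temperature."""
    temp_features = list(zip(st_temp, solar_elev))
    temp_coef = fit_linear(temp_features, ref_temp)
    temp_corrected = predict(temp_coef, temp_features)
    # Stage 2 sees the stage-1 output, not the raw temperature.
    hum_coef = fit_linear(list(zip(st_hum, temp_corrected)), ref_hum)
    return temp_coef, hum_coef

def apply_two_stage(temp_coef, hum_coef, st_temp, st_hum, solar_elev):
    """Apply both stages in order to new ST observations."""
    temp_corrected = predict(temp_coef, list(zip(st_temp, solar_elev)))
    hum_corrected = predict(hum_coef, list(zip(st_hum, temp_corrected)))
    return temp_corrected, hum_corrected
```

Chaining the stages this way means a temperature error left by stage 1 propagates into stage 2, which is one reason to fit the humidity model on the corrected temperature rather than on the reference temperature.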
We also designed an experiment suited to ST's characteristics to measure its instrument error. In parallel intercomparison experiments, we computed the measurement bias between ST and a surface weather station (WXT) by linear regression and applied a linear bias correction to bridge the two datasets. The calibrated ST units then served as the standard reference for instrument error measurements of other units. We measured over 270 ST radiosondes to collect instrument error information. The data show that the ST measurement error is normally distributed. After the statistical correction, about 70% of the ST instrument errors fall within the instrument's accuracy limit. The remaining 30% of the instruments still show the characteristics of random error, which can be corrected by the machine learning models.
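A minimal sketch of the two statistical steps described above: fitting a straight line that maps ST readings onto the reference station readings, and computing a Kolmogorov-Smirnov statistic to check whether the instrument errors look normally distributed. The function names are hypothetical, and estimating the mean and standard deviation from the same sample is an assumption of this sketch rather than a detail taken from the study.

```python
import math

def fit_line(st_readings, reference_readings):
    """Least-squares slope and intercept mapping ST readings to the reference."""
    n = len(st_readings)
    mean_x = sum(st_readings) / n
    mean_y = sum(reference_readings) / n
    sxx = sum((x - mean_x) ** 2 for x in st_readings)
    sxy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(st_readings, reference_readings)
    )
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def apply_correction(st_readings, slope, intercept):
    """Apply the fitted linear correction to raw ST readings."""
    return [slope * x + intercept for x in st_readings]

def ks_statistic_normal(errors):
    """K-S distance between the sample and a normal distribution.

    The normal distribution uses the sample mean and the sample
    standard deviation (an assumption of this sketch).
    """
    n = len(errors)
    mean = sum(errors) / n
    std = math.sqrt(sum((e - mean) ** 2 for e in errors) / (n - 1))
    distance = 0.0
    for i, value in enumerate(sorted(errors)):
        cdf = 0.5 * (1.0 + math.erf((value - mean) / (std * math.sqrt(2.0))))
        # Compare the model CDF with the empirical CDF just after and just before this point.
        distance = max(distance, (i + 1) / n - cdf, cdf - i / n)
    return distance
```

A small statistic means the empirical distribution of errors stays close to the fitted normal curve. Because the parameters are estimated from the sample itself, the standard K-S critical values are conservative here; a Lilliefors-type table is the usual remedy.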
Furthermore, we analyzed the stability and calibration performance of the machine learning models. The linear calibration model (GLM) calibrates the observations well at low levels (1000-700 hPa). The non-linear calibration model (GBM), which has a more complex structure, maintains even better calibration performance statistically. However, because it achieves calibration through a non-linear algorithm, the mechanisms and physical meaning of its corrections are harder to explain. Our machine-learning-based calibration model corrects the warm bias and the water vapor overestimation in the original observations while retaining the vertical structure of the atmosphere present in those observations.