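The previous post used an example to show how program trading has traditionally worked: we write our trading logic as a script, define the decision process explicitly, and backtest it on historical data to check its win rate. Only then do we decide whether to use the strategy, which instruments to apply it to, and under what conditions to enter and exit. All of the decision logic is expressed with and, or, not, and xor, which is generally called a rule-based approach. Today I want to work through a concrete example: use XS to list the factors I believe affect stock prices, prepare enough samples, and then use a Python AI module to see whether a model such as a multilayer perceptron can predict the upcoming long/short direction.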
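First, let me explain how I think about the problem.
As I see it, an AI algorithm takes known inputs x1, x2, x3, ..., xn and builds a function whose output Yf is as close as possible to the true value Yt.
Yf = f(x1, x2, ..., xn)
min |Yf - Yt|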
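To make the two formulas concrete, here is a minimal sketch in plain Python. It assumes a logistic function as f and uses made-up toy data; neither comes from the original post, and the names `predict`, `mean_abs_gap`, and `train` are hypothetical. It only illustrates that fitting adjusts the parameters of f so that the gap between Yf and Yt shrinks.

```python
import math

def predict(weights, bias, features):
    """Yf = f(x1..xn): here a logistic function of a weighted sum (an illustrative choice of f)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def mean_abs_gap(weights, bias, rows, targets):
    """Average |Yf - Yt| over all samples."""
    gaps = [abs(predict(weights, bias, r) - t) for r, t in zip(rows, targets)]
    return sum(gaps) / len(gaps)

def train(rows, targets, steps=200, lr=0.5):
    """Plain gradient descent on the log loss; returns the fitted weights and bias."""
    n = len(rows[0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * n, 0.0
        for r, t in zip(rows, targets):
            err = predict(weights, bias, r) - t  # gradient of the log loss w.r.t. z
            grad_b += err
            for i, x in enumerate(r):
                grad_w[i] += err * x
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias

# Made-up toy data: the target simply follows the first feature.
rows = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]
targets = [0, 1, 1, 0]

gap_before = mean_abs_gap([0.0, 0.0], 0.0, rows, targets)
weights, bias = train(rows, targets)
gap_after = mean_abs_gap(weights, bias, rows, targets)
print(gap_before, gap_after)
```

The untrained function outputs 0.5 for every sample, so the gap starts at 0.5 and falls as the weights are fitted.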
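So building an AI model involves five tasks:
1. Choose the input features x1, x2, ..., xn.
2. Decide what answer we are looking for, that is, what the output is and what we want to predict.
3. Prepare data that the computer can learn from.
4. Build the model.
5. Decide how to measure predictive power.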
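For example, suppose I want to use the various data available after yesterday's close to predict whether a given stock will close higher today than it did the day before.
First, we have to decide which data influence the next day's price action.
In this example I used three kinds of data:
- The stock-character data I introduced yesterday: a field is recorded as 1 when it exceeds its 20-day average level by a set percentage, and as 0 otherwise
- The relative positions of the bar's open, high, low, and close
- The values of several commonly used technical indicators
Following this approach, the XS data-preparation script I wrote is as follows: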
variable: v1(0), v2(0), v3(0), v4(0), v5(0), v6(0), v7(0), v8(0), v9(0), v10(0),
          v11(0), v12(0), v13(0), v14(0), v15(0), v16(0), v17(0), v18(0), v19(0), v20(0);
variable: v21(0), v22(0), v23(0), v24(0), v25(0), v26(0), v27(0), v28(0), v29(0), v30(0);
var: y(0);

// If a field behaves differently from its usual level, record 1, otherwise 0;
// these 0/1 flags are the stock-character features
input: day(20);
input: ratio(30);
variable: count(0), x(0);

value1 = GetField("總成交次數", "D");   // total number of trades
value2 = average(value1, day);
value3 = GetField("強弱指標");          // strength relative to the market
value5 = GetField("外盤均量");
value6 = average(value5, day);
value7 = GetField("主動買力");
value8 = average(value7, day);
value9 = GetField("開盤委買");
value10 = average(value9, day);
value11 = GetField("資金流向");
value12 = average(value11, day);
value13 = countif(value3 > 1, day);     // number of days stronger than the market
value14 = average(value13, day);
value16 = GetField("法人買張");
count = 0;

if value1 > value2 * (1 + ratio/100) then v1 = 1 else v1 = 0;
if value13 > value14 * (1 + ratio/100)  // days stronger than the market
then v2 = 1 else v2 = 0;
if value5 > value6 * (1 + ratio/100) then v3 = 1 else v3 = 0;
if value7 > value8 * (1 + ratio/100) then v4 = 1 else v4 = 0;
if value9 > value10 * (1 + ratio/100) then v5 = 1 else v5 = 0;
if truerange > average(truerange, 20)   // true range
then v6 = 1 else v6 = 0;

if truerange <> 0 then begin
    if close <= open then
        value15 = (close - low) / truerange * 100
    else
        value15 = (open - low) / truerange * 100;   // strength of buying support
end;
if value15 > average(value15, day) * (1 + ratio/100) then v7 = 1 else v7 = 0;

if volume <> 0 then value17 = value16 / volume * 100;   // institutional buying as a percentage of volume
if value17 > average(value17, 10) * (1 + ratio/100) then v8 = 1 else v8 = 0;
if value11 > average(value11, 10) * (1 + ratio/100) then v9 = 1 else v9 = 0;

x = 1;   // each qualifying day adds 1 to the sum
value18 = summationif(close >= close[1] * 1.02, x, 5);   // days with a large gain in the last N days
if value18 >= 2 then v10 = 1 else v10 = 0;

value19 = GetField("融資買進張數");
value20 = GetField("融券買進張數");
value21 = (value19 + value20);
value22 = average(value21, day);
if value21 < value22 * 0.9   // retail long indicator
then v11 = 1 else v11 = 0;
if close * 1.2 < close[30] then v12 = 1 else v12 = 0;

// Turn the differences among a bar's open, high, low, and close into six features
v13 = (close - open) / close * 100;   // percentage change within the bar
v14 = (close - low) / close * 100;
v15 = (high - close) / close * 100;
v16 = (high - low) / close * 100;
v17 = (high - open) / close * 100;
v18 = (open - low) / close * 100;

// Whether the single-day gain exceeds 2.5% is one feature
if close > close[1] * 1.025 then v19 = 1 else v19 = 0;

// The combined change over the last two days is one feature
v20 = close / close[2];

// The results of several common technical indicators are also features
variable: rsv1(0), k1(0), d1(0);
stochastic(9, 3, 3, rsv1, k1, d1);
v21 = k1;

input: period(20, "計算區間");
value1 = rateofchange(close, period);          // percentage change over the period
value2 = arctangent(value1 / period * 100);    // angle of the rise
v22 = value2;

input: Length1(14, "期數"), Threshold(25, "穿越值");
variable: pdi_value(0), ndi_value(0), adx_value(0);
DirectionMovement(Length1, pdi_value, ndi_value, adx_value);
v23 = pdi_value;

input: FastLength(12, "DIF短期期數"), SlowLength(26, "DIF長期期數"), MACDLength(9, "MACD期數");
variable: difValue(0), macdValue(0), oscValue(0);
MACD(weightedclose(), FastLength, SlowLength, MACDLength, difValue, macdValue, oscValue);
v25 = difValue;
v26 = momentum(close, 10);
value26 = rsi(close, 12);
v27 = value26;
v24 = linearregslope(value26, 6);

input: Length2(20);   // calculation period
variable: u1(0), u2(0), u3(0), u4(0), u5(0), u6(0);
LinearReg(close, Length2, 0, u1, u2, u3, u4);   // 20-day linear regression of the close {u1: slope, u4: predicted value}
u5 = rsquare(close, u4, 20);                    // R-squared between the close and the regression value
v28 = u5;
v29 = u1;

value11 = GetField("投信買賣超");
input: day1(8);
v30 = countif(value11 > 0, day1);

// Define the output Y
if close > close[1] then y = 1 else y = 0;

Print(file("C:\Users\lee\.spyder-py3\f301.log"),
    numtostr(v1[1], 0), ",", numtostr(v2[1], 0), ",", numtostr(v3[1], 0), ",",
    numtostr(v4[1], 0), ",", numtostr(v5[1], 0), ",", numtostr(v6[1], 0), ",",
    numtostr(v7[1], 0), ",", numtostr(v8[1], 0), ",", numtostr(v9[1], 0), ",",
    numtostr(v10[1], 0), ",", numtostr(v11[1], 0), ",", numtostr(v12[1], 0), ",",
    numtostr(v13[1], 2), ",", numtostr(v14[1], 2), ",", numtostr(v15[1], 2), ",",
    numtostr(v16[1], 2), ",", numtostr(v17[1], 2), ",", numtostr(v18[1], 2), ",",
    numtostr(v19[1], 0), ",", numtostr(v20[1], 2), ",", numtostr(v21[1], 2), ",",
    numtostr(v22[1], 2), ",", numtostr(v23[1], 2), ",", numtostr(v24[1], 2), ",",
    numtostr(v25[1], 2), ",", numtostr(v26[1], 2), ",", numtostr(v27[1], 2), ",",
    numtostr(v28[1], 2), ",", numtostr(v29[1], 2), ",", numtostr(v30[1], 2), ",",
    numtostr(y, 0));
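For readers more at home in Python, the 0/1 rule behind the first group of features (a value above its 20-day average by `ratio` percent) can be sketched as below. This is only an illustration of the rule, not a port of the XS script: the function name `above_average_flags` is hypothetical, the trailing average is assumed to include the current bar, and bars without a full window are simply flagged 0, which may differ from how XS treats the first bars.

```python
def above_average_flags(values, window=20, ratio=30):
    """Return 1 where a value exceeds its trailing `window`-bar mean by `ratio` percent, else 0.

    Assumption: the trailing mean includes the current bar, and bars with
    fewer than `window` observations are flagged 0.
    """
    flags = []
    for i, v in enumerate(values):
        if i + 1 < window:
            flags.append(0)  # not enough history yet
            continue
        mean = sum(values[i + 1 - window : i + 1]) / window
        flags.append(1 if v > mean * (1 + ratio / 100) else 0)
    return flags

# Made-up series: a quiet stretch followed by one spike.
trade_counts = [100, 100, 100, 100, 100, 200]
print(above_average_flags(trade_counts, window=3, ratio=30))
```

With a 3-bar window the spike of 200 is compared against a mean of about 133, so only the last bar is flagged.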
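I ran this script in Strategy Radar (策略雷達) on the stock 台郡 (Flexium Interconnect).
This produces a text file. I converted that text file to a CSV file and added a header row, so that it could serve as data for the Python AI module.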
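The conversion can be done by hand, but a small script keeps it repeatable. The sketch below uses only the standard library. The column names `v1` to `v30` plus `yy` match the ones the Python code in this post reads; the function name `log_to_csv`, the UTF-8 encoding, and the rule of skipping rows with the wrong number of fields are my assumptions.

```python
import csv

# Header expected by the training code: v1 ... v30 plus the label column yy.
COLUMNS = [f"v{i}" for i in range(1, 31)] + ["yy"]

def log_to_csv(log_path, csv_path):
    """Copy comma-separated rows from log_path to csv_path under a header row.

    Returns the number of data rows written. Rows that do not have exactly
    len(COLUMNS) fields (blank or malformed lines) are skipped.
    """
    written = 0
    with open(log_path, newline="", encoding="utf-8") as src, \
         open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(COLUMNS)
        for row in csv.reader(src):
            row = [cell.strip() for cell in row]
            if len(row) != len(COLUMNS):
                continue
            writer.writerow(row)
            written += 1
    return written
```

A call would look like `log_to_csv("input.log", "output.csv")`, where both file names are placeholders for the actual paths.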
The Python multilayer perceptron code I used next is as follows:
import numpy as np
import pandas as pd
# Read the CSV data
df = pd.read_csv('f303.csv')
df.head()
# Select the required columns
feature_cols = [f'v{i}' for i in range(1, 31)]  # 'v1' ... 'v30'
cols_2d = df[feature_cols + ['yy']]
cols_2d.head()
# Features, X
X = cols_2d[feature_cols]
X.head()
# Label, y
y = cols_2d['yy']
y.head()
# Split into training and test data
from sklearn.model_selection import train_test_split
X_train_o, X_test_o, y_train, y_test = train_test_split(X, y, test_size = 0.3) # 70% training, 30% test
# Check the row counts after the split
print('total:', len(cols_2d), 'train X:', len(X_train_o), 'test X:', len(X_test_o), 'train y:', len(y_train), 'test y:', len(y_test))
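Each row here is one trading day, and features such as 20-day averages overlap heavily between neighbouring days, so a shuffled split can place nearly identical days on both sides. One alternative worth comparing, which is not what the code above does, is a time-ordered split: train on the earliest rows and test on the latest. The helper name `chronological_split` is hypothetical; in scikit-learn the same effect should be available by passing `shuffle=False` to `train_test_split`.

```python
def chronological_split(rows, test_fraction=0.3):
    """Split time-ordered rows: the earliest rows train, the latest rows test."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    cut = int(round(len(rows) * (1 - test_fraction)))
    return rows[:cut], rows[cut:]

# Made-up example: ten days numbered in time order.
days = list(range(10))
train_days, test_days = chronological_split(days, test_fraction=0.3)
print(train_days, test_days)
```

The first seven days end up in the training set and the last three in the test set, so the model is always evaluated on days later than any it was fitted on.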
# Normalize all X columns, scaling the original values into the 0-1 range
from sklearn import preprocessing
minmax_scale = preprocessing.MinMaxScaler(feature_range = (0, 1))
X_train = minmax_scale.fit_transform(X_train_o)
X_test = minmax_scale.transform(X_test_o)  # reuse the ranges learned from the training data
X_train[:10]
X_test[:10]
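What min-max scaling does to each column can be shown in a few lines of plain Python. This is a simplified sketch, not scikit-learn's implementation, and the names `fit_min_max` and `apply_min_max` are hypothetical. It learns each column's minimum and maximum from the training rows only and reuses them on the test rows, so both sets are measured on the same scale.

```python
def fit_min_max(rows):
    """Learn the (min, max) of every column from the training rows."""
    columns = list(zip(*rows))
    return [(min(c), max(c)) for c in columns]

def apply_min_max(rows, ranges):
    """Scale rows with previously learned ranges; constant columns map to 0.0."""
    scaled = []
    for row in rows:
        scaled.append([
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, (lo, hi) in zip(row, ranges)
        ])
    return scaled

# Made-up numbers: two features, two training rows, one test row.
train_rows = [[0.0, 10.0], [10.0, 20.0]]
test_rows = [[5.0, 25.0]]

ranges = fit_min_max(train_rows)
print(apply_min_max(train_rows, ranges))
print(apply_min_max(test_rows, ranges))
```

Note that a test value outside the training range, such as 25 against a training maximum of 20, scales to a number above 1.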
# Build the MLP (multilayer perceptron) model
from keras.models import Sequential
from keras.layers import Dense, Dropout
model = Sequential()
model.add(Dense(units = 120, input_dim = 30, kernel_initializer = 'uniform', activation = 'relu'))
model.add(Dense(units = 100, kernel_initializer = 'uniform', activation = 'relu'))
#model.add(Dropout(0.35))
model.add(Dense(units = 80, kernel_initializer = 'uniform', activation = 'relu'))
#model.add(Dropout(0.35))
model.add(Dense(units = 70, kernel_initializer = 'uniform', activation = 'relu'))
#model.add(Dropout(0.35))
model.add(Dense(units = 1, kernel_initializer = 'uniform', activation = 'sigmoid'))
model.compile(loss = 'binary_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
train_history = model.fit(x = X_train, y = y_train, validation_split = 0.1, epochs = 30, batch_size = 30, verbose = 2)
# Evaluate on the test data
scores = model.evaluate(x = X_test, y = y_test)
# Prediction accuracy
scores[1]
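Through this computation we can find a set of parameters with which these features predict whether the stock rises or falls the next day with an accuracy of up to 66%.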
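An accuracy figure is easier to read next to a simple baseline, which ties back to the fifth task above: deciding how to measure predictive power. The sketch below computes the accuracy of always predicting the most common label in the training set. It is one possible yardstick that I am adding for illustration, with made-up labels and hypothetical function names, not something the original experiment reported.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Share of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def majority_baseline(y_train, y_test):
    """Accuracy obtained by always predicting the most common training label."""
    majority = Counter(y_train).most_common(1)[0][0]
    return accuracy(y_test, [majority] * len(y_test))

# Made-up labels: 1 = closed higher than the previous day, 0 = did not.
y_train_demo = [1, 1, 0]
y_test_demo = [1, 0, 1, 1]
print(majority_baseline(y_train_demo, y_test_demo))
```

If the model's test accuracy is not clearly above this number, the features are adding little beyond the base rate of up days.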
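The above is an example of how XS can serve as a platform for preparing feature data before the AI computation. Artificial intelligence is a vast and deep field, and using it for investing means overcoming many hurdles. XS looks like it can help with feature extraction and with organizing the data.
As for the modeling itself, we will just have to keep studying on our own.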