Monday, June 10, 2019

[Stocks] Predicting TSMC's stock price trend with a Keras RNN

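Steps:
1. Download TSMC's stock prices
2. Normalize the data
3. Prepare the training set and test set:
    The training set's features are arrays of the opening prices over 60 days; consecutive samples are offset by one day
    The training set's labels are the opening price on day 70 (index i+10 in the code, for a window covering indices i-60 to i-1)
    The test set is built from the prices that come after the last training sample
4. Build the RNN model from the training set
5. Visualize the training process
6. Test the model on the test set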
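Step 3 can be illustrated on a toy series before touching real prices. The following is a minimal sketch, assuming the same indexing as the loops later in this post; the helper name `make_windows` and its `window` and `horizon` parameters are made up for illustration and are not part of the original code.

```python
import numpy as np

def make_windows(series, window=60, horizon=10):
    """Build (features, labels) from a 1-D series.

    Sample i uses series[i-window:i] as features and series[i+horizon]
    as the label, mirroring the indexing used in this post's loops.
    """
    series = np.asarray(series, dtype=float)
    X, y = [], []
    for i in range(window, len(series) - horizon):
        X.append(series[i - window:i])
        y.append(series[i + horizon])
    # Keras LSTM layers expect (samples, timesteps, features)
    X = np.array(X).reshape(-1, window, 1)
    return X, np.array(y)

toy = np.arange(100, dtype=float)   # toy "prices": 0, 1, ..., 99
X, y = make_windows(toy)
print(X.shape, y.shape)             # (30, 60, 1) (30,)
print(X[0, -1, 0], y[0])            # 59.0 70.0
```

For the first sample, the window ends at index 59 and the label sits at index 70, which is where the "day 70" in step 3 comes from.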

import numpy as np

import matplotlib.pyplot as plt

import pandas as pd

import datetime

from yahoo_historical import Fetcher
# Download prices for TSM, TSMC's ADR (listed on the NYSE; a component of the Philadelphia Semiconductor Index)
data = Fetcher("TSM", [2007,1,1], [2019,1,1])
df=pd.DataFrame(data.getHistorical())
#print(data.getHistorical())
df=df.set_index('Date')
df.head()

df['Open'].plot()


# Normalize the data (column 0 is 'Open' once 'Date' is the index)
training_set = df.iloc[:,0:1].values
from sklearn.preprocessing import MinMaxScaler
sc = MinMaxScaler(feature_range = (0, 1))
training_set_scaled = sc.fit_transform(training_set)
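
Note that the scaler above is fitted on the whole series, so the minimum and maximum of the later test-period prices also shape how the training data is scaled. A possible variant, sketched below with plain NumPy, fits the min-max parameters on the training rows only and then applies them to every row. The function names, the `split` value, and the prices are all hypothetical; this is not what the original code does.

```python
import numpy as np

def fit_min_max(train_values):
    """Return the per-column min and max of the training rows only."""
    return train_values.min(axis=0), train_values.max(axis=0)

def apply_min_max(values, lo, hi):
    """Scale values with (x - min) / (max - min), the same formula
    MinMaxScaler uses for feature_range=(0, 1)."""
    return (values - lo) / (hi - lo)

# Made-up opening prices; the first `split` rows play the role of training data
prices = np.array([[10.0], [12.0], [14.0], [20.0], [22.0]])
split = 3

lo, hi = fit_min_max(prices[:split])
scaled = apply_min_max(prices, lo, hi)
print(scaled.ravel())   # later rows can fall outside [0, 1]
```

With scikit-learn the equivalent would be calling `sc.fit` on the training rows and `sc.transform` on the full array. The trade-off is that test-period values may land outside the 0 to 1 range.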

# Prepare the training set: each sample is a 60-day window of opening prices; the label is the price at index i+10
X_train = []
y_train = []
for i in range(60, 2035):
    X_train.append(training_set_scaled[i-60:i, 0])
    y_train.append(training_set_scaled[i+10, 0])
X_train, y_train = np.array(X_train), np.array(y_train)

X_train = np.reshape(X_train, (X_train.shape[0], X_train.shape[1], 1))



# Prepare the test set from the rows after the training range
X_test = []
y_test = []
for i in range(2035, len(training_set_scaled)-10):
    X_test.append(training_set_scaled[i-60:i, 0])
    y_test.append(training_set_scaled[i+10, 0])
X_test, y_test = np.array(X_test), np.array(y_test)

X_test = np.reshape(X_test, (X_test.shape[0], X_test.shape[1], 1))


# Import Keras
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers import Dropout


# Build the model: four stacked LSTM layers with dropout, followed by one dense output unit
regressor = Sequential()
regressor.add(LSTM(units = 50, return_sequences = True, input_shape = (X_train.shape[1], 1)))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units = 50, return_sequences = True))
regressor.add(Dropout(0.2))
regressor.add(LSTM(units = 50))
regressor.add(Dropout(0.2))
regressor.add(Dense(units = 1))
regressor.compile(optimizer = 'adam', loss = 'mean_squared_error', metrics=['mae'])
train_history=regressor.fit(X_train, y_train, validation_split=0.1, epochs = 10, batch_size = 50)


# Visualize the training process
import matplotlib.pyplot as plt
def show_train_history(train_history,train,validation):
    plt.plot(train_history.history[train])
    plt.plot(train_history.history[validation])
    plt.title('Train History')
    plt.ylabel(train)
    plt.xlabel('Epoch')
    plt.legend(['train','validation'],loc='upper left')
    plt.show()

# Plot training loss against validation loss for each epoch
show_train_history(train_history,'loss','val_loss')


# Evaluate the model on the test set (score[1] is the MAE on the scaled prices)
score=regressor.evaluate(X_test,y_test)
print('mae=',score[1])
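
The MAE printed here is measured on prices scaled to the 0 to 1 range, so it is not in dollars. Because min-max scaling is linear, multiplying the scaled MAE by (max - min) gives the MAE in price units. The sketch below checks that arithmetic on made-up numbers; none of the values come from the actual TSM data.

```python
import numpy as np

prices = np.array([20.0, 25.0, 30.0, 40.0])       # made-up actual prices
lo, hi = prices.min(), prices.max()
scaled = (prices - lo) / (hi - lo)                 # min-max scaling to [0, 1]

# Made-up predictions expressed in the scaled space
pred_scaled = scaled + np.array([0.05, -0.05, 0.10, -0.10])

mae_scaled = np.mean(np.abs(pred_scaled - scaled))
mae_price = mae_scaled * (hi - lo)                 # rescale the error only

# Cross-check: invert the scaling first, then measure the error
pred_price = pred_scaled * (hi - lo) + lo
mae_direct = np.mean(np.abs(pred_price - prices))

print(mae_scaled, mae_price, mae_direct)
```

With a fitted `MinMaxScaler`, the (max - min) factor is available as the `data_range_` attribute.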


# Run the predictions
prediction=regressor.predict(X_test)

# Convert the predictions and labels back to the original price scale
prediction_t = sc.inverse_transform(prediction.reshape(-1,1))
y_test_t = sc.inverse_transform(y_test.reshape(-1,1))

# Plot the actual and predicted price trends
plt.plot(np.arange(len(y_test)),y_test_t,label='real')
plt.plot(np.arange(len(prediction)),prediction_t,label='prediction')
plt.title('TSMC (TSM ADR) price: actual vs. predicted')
plt.legend()

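The predicted trend looks accurate, but these predictions still use actual prices from after the training period as input. In addition, this model only uses the opening price. In practice, the day's trading volume and institutional investors' positions also affect future prices, so later models will add daily volume and institutional positions as features.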
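
One hedged way to check how much of the apparent accuracy comes from the recent actual prices inside each input window is to compare the model against a naive persistence baseline, which simply predicts that the price 10 days out equals the last price in the window. This baseline is not part of the original post. The sketch below runs it on a synthetic random walk, and the helper names are made up for illustration.

```python
import numpy as np

def mae(a, b):
    """Mean absolute error between two arrays."""
    return float(np.mean(np.abs(np.asarray(a) - np.asarray(b))))

def persistence_forecast(X):
    """Predict each label as the last value of its input window."""
    return X[:, -1, 0]

rng = np.random.default_rng(0)
series = 50 + np.cumsum(rng.normal(0, 1, 500))   # synthetic random walk, not real prices

window, horizon = 60, 10
idx = np.arange(window, len(series) - horizon)
X = np.stack([series[i - window:i] for i in idx])[:, :, None]   # (samples, 60, 1)
y = series[idx + horizon]

baseline = persistence_forecast(X)
print('baseline MAE =', mae(y, baseline))
```

Applied to `X_test` and `y_test` from this post, the same two helpers would give a number to compare with the model's MAE. If the LSTM does not beat it, the plot mostly reflects the inputs rather than predictive skill.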

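Next, we use every column returned by yahoo_historical (Open, High, Low, Close, Adj Close, Volume) as features to build the model.
The training history after 100 epochs of training is as follows:
The model using six features fits the stock price worse than the model using the opening price alone.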
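
The code for the six-feature model is not shown in this post. As a rough sketch of how the input windows might be built, assuming the six columns are already scaled and the label is still the opening price (column 0) at index i+10, one could write the following. The helper name, its parameters, and the random stand-in data are all illustrative assumptions.

```python
import numpy as np

def make_multifeature_windows(table, target_col=0, window=60, horizon=10):
    """Build windows from a 2-D array of shape (days, features).

    Each sample holds `window` consecutive rows with all columns;
    the label is column `target_col` at index i + horizon.
    """
    table = np.asarray(table, dtype=float)
    X, y = [], []
    for i in range(window, len(table) - horizon):
        X.append(table[i - window:i, :])
        y.append(table[i + horizon, target_col])
    return np.array(X), np.array(y)

rng = np.random.default_rng(0)
# Stand-in for scaled Open, High, Low, Close, Adj Close, Volume
fake = rng.random((200, 6))

X, y = make_multifeature_windows(fake)
print(X.shape, y.shape)   # (130, 60, 6) (130,)
```

With inputs of this shape, the first LSTM layer would take `input_shape = (60, 6)` instead of `(60, 1)`.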

Long-term dollar-cost-averaging returns: 3x leveraged vs. 1x ETFs

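Below is an analysis of 3x leveraged and 1x ETFs on Chinese and US stocks and bonds. The 3x leveraged funds fall much further than their 1x counterparts during declines. Viewed along the time axis, the 3x funds fall so much harder that they give back all of their earlier gains. For markets that rise over the long term, such as US tech stocks, rising periods last far longer than falling ones, so the long-term return of holding TQQQ will...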