Super-Easy Stock Price Prediction in Python with TPOT: Automated Machine Learning (AutoML) Using a Neural Network

July 4, 2021, 22:10

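This post uses TPOT in Python to predict whether a stock will close up or down the next day, the super-easy way, with automated machine learning (AutoML) and a neural network.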

1. Install the tools

$ pip install tpot yfinance torch torchvision
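
Before moving on, it can save time to confirm that the four packages are actually importable. The sketch below is only an illustration: the helper name missing_packages and the REQUIRED mapping are made up for this post, and the check uses nothing beyond the standard library. Note that PyTorch is installed from pip as torch and imported as torch.

```python
from importlib.util import find_spec

# pip package name -> module name used in `import` statements.
# This mapping is written for this post; extend it as needed.
REQUIRED = {
    "tpot": "tpot",
    "yfinance": "yfinance",
    "torch": "torch",
    "torchvision": "torchvision",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose top-level modules cannot be located."""
    return [
        pip_name
        for pip_name, module in required.items()
        if find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing:", ", ".join(missing) if missing else "none")
```

If anything is listed as missing, rerun the pip command above for those names.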

2. Create the file

pred.py

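from tpot import TPOTClassifier
import yfinance as yf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score

# Ten years of daily AAPL prices.
# Written against the yfinance of mid-2021; newer releases may return
# different columns (for example, no "Adj Close"), in which case the
# drop list below needs adjusting.
df = yf.download("AAPL", start="2010-11-01", end="2020-11-01")

# Features: 2-day simple moving average and close * volume.
df["Diff"] = df.Close.diff()
df["SMA_2"] = df.Close.rolling(2).mean()
df["Force_Index"] = df.Close * df.Volume

# Label: 1 if the NEXT day's close is higher than today's, else 0.
df["y"] = df["Diff"].apply(lambda x: 1 if x > 0 else 0).shift(-1)

# Keep only the two features and the label; dropna() removes the first
# row (no SMA_2 yet) and the last row (no next-day label).
df = df.drop(
    ["Open", "High", "Low", "Close", "Volume", "Diff", "Adj Close"],
    axis=1,
).dropna()
# print(df)

# Note: the scaler is fit on the full series before the split, so
# test-period statistics leak into the training features.
X = StandardScaler().fit_transform(df.drop(["y"], axis=1))
y = df["y"].values

# Chronological split: the last 20% of days form the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.2,
    shuffle=False,
)

# Restrict the search to TPOT's neural-network configuration, ending
# every pipeline in the PyTorch logistic-regression classifier.
tpot = TPOTClassifier(
    config_dict='TPOT NN',
    template='Selector-Transformer-PytorchLRClassifier',
    generations=10,
    population_size=10,
    verbosity=2,
)
tpot.fit(
    X_train,
    y_train,
)
y_pred = tpot.predict(X_test)
print(accuracy_score(y_test, y_pred))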
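
The trickiest part of pred.py is the label column, so here is a small sketch that reproduces the same feature and label steps on a toy table. The prices and volumes are invented for illustration and are not market data; only pandas is needed.

```python
import pandas as pd

# Toy closing prices and volumes (made-up values, not market data).
toy = pd.DataFrame(
    {
        "Close": [10.0, 11.0, 10.5, 10.5, 12.0],
        "Volume": [100, 200, 150, 120, 300],
    }
)

# Same steps as pred.py.
diff = toy["Close"].diff()
toy["SMA_2"] = toy["Close"].rolling(2).mean()
toy["Force_Index"] = toy["Close"] * toy["Volume"]

# 1 if the close rose versus the previous day, else 0 ...
up_today = diff.apply(lambda x: 1 if x > 0 else 0)
# ... then shifted back one row so each row holds the NEXT day's move.
toy["y"] = up_today.shift(-1)

# Row 0 has no SMA_2 and the last row has no next-day label,
# so dropna() removes both.
features = toy.drop(["Close", "Volume"], axis=1).dropna()
print(features)
```

Two details are worth noticing. A day whose close is unchanged (row 3 here) counts as 0, the same as a down day. And the most recent row is always dropped, because its next-day label does not exist yet.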

3. Run

$ python pred.py

[*********************100%***********************]  1 of 1 completed

Generation 1 - Current best internal CV score: 0.5263360616273471

Generation 2 - Current best internal CV score: 0.5263360616273471

Generation 3 - Current best internal CV score: 0.5263360616273471

Generation 4 - Current best internal CV score: 0.5263360616273471

Generation 5 - Current best internal CV score: 0.5283224078120563

Generation 6 - Current best internal CV score: 0.5283224078120563

Generation 7 - Current best internal CV score: 0.5283224078120563

Generation 8 - Current best internal CV score: 0.5283224078120563

Generation 9 - Current best internal CV score: 0.5283224078120563

Generation 10 - Current best internal CV score: 0.5283224078120563

Best pipeline: PytorchLRClassifier(ZeroCount(SelectFromModel(input_matrix, criterion=entropy, max_features=0.8500000000000001, n_estimators=100, threshold=0.2)), batch_size=16, learning_rate=0.001, num_epochs=15, weight_decay=0)

0.503968253968254

That's it. Super easy!
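
An accuracy figure for an up/down prediction is easier to judge next to a trivial baseline, such as always predicting the class that was most common in the training period. The sketch below shows that comparison on made-up labels; the function names are chosen for this post and nothing here comes from TPOT itself.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of positions where prediction and truth agree."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline_accuracy(y_train, y_test):
    """Accuracy of always predicting the most common training label."""
    majority = Counter(y_train).most_common(1)[0][0]
    return accuracy(y_test, [majority] * len(y_test))


# Made-up labels for illustration only.
y_train = [1, 1, 0, 1, 0, 1]
y_test = [1, 0, 1, 1]
y_pred = [1, 1, 0, 1]

print(accuracy(y_test, y_pred))                     # 0.5
print(majority_baseline_accuracy(y_train, y_test))  # 0.75
```

In pred.py the same check could be run with the real y_train, y_test and y_pred. A model that does not beat this baseline has not learned anything useful from the features.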

4. Results

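I ran the same data and features through PyCaret, PyCaret (bagging), PyCaret (voting), PyCaret (stacking), TPOT, TPOT (NN), Auto-sklearn, AutoGluon, AutoKeras, and FLAML. Of these, TPOT scored the highest accuracy (0.5556), while TPOT (NN), the configuration used in this article, came second to last (0.5040).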

PyCaret            0.5178571428571429
PyCaret(bagging)   0.5496031746031746
PyCaret(voting)    0.5535714285714286
PyCaret(stacking)  0.5496031746031746
TPOT               0.5555555555555556
TPOT(NN)           0.503968253968254
Auto-sklearn       0.5198412698412699
AutoGluon          0.5496031746031746
AutoKeras          0.4861111111111111
FLAML              0.5277777777777778
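
To put these numbers in perspective, the sketch below ranks the ten scores and measures how far each one sits from a coin flip. The test-set size of 504 is an inference, not something stated in the results: every accuracy above is an exact multiple of 1/504, which also matches 20% of roughly ten years of trading days. The z value is a rough guide only, since it treats each day as an independent trial and ignores that ten tools were compared on the same test set.

```python
import math

# Accuracies copied from the table above.
RESULTS = {
    "PyCaret": 0.5178571428571429,
    "PyCaret(bagging)": 0.5496031746031746,
    "PyCaret(voting)": 0.5535714285714286,
    "PyCaret(stacking)": 0.5496031746031746,
    "TPOT": 0.5555555555555556,
    "TPOT(NN)": 0.503968253968254,
    "Auto-sklearn": 0.5198412698412699,
    "AutoGluon": 0.5496031746031746,
    "AutoKeras": 0.4861111111111111,
    "FLAML": 0.5277777777777778,
}

# ASSUMPTION: test-set size inferred from the accuracies, see text.
N_TEST = 504


def z_vs_coin_flip(acc, n):
    """Standard errors between `acc` and 0.5 under a fair-coin model."""
    standard_error = math.sqrt(0.5 * 0.5 / n)
    return (acc - 0.5) / standard_error


ranked = sorted(RESULTS.items(), key=lambda kv: kv[1], reverse=True)
for name, acc in ranked:
    correct = round(acc * N_TEST)
    z = z_vs_coin_flip(acc, N_TEST)
    print(f"{name:<18} {acc:.4f}  correct={correct:>3}/{N_TEST}  z={z:+.2f}")
```

Under these assumptions, one standard error is about 0.022, so most of the tools land within one or two standard errors of each other.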

