Joining price data for multiple ETFs and computing covariance and correlation coefficients
Category: quant
Data format
time,open,close,low,high,volume
2019-08-12,3.708,3.756,3.701,3.758,180115577
import pandas as pd
sh510300file = "/Users/dugang/data/sh510300.csv"
sh510900file = "/Users/dugang/data/sh510900.csv"
sh513180file = "/Users/dugang/data/sh513180.csv"
sh588000file = "/Users/dugang/data/sh588000.csv"
sz159949file = "/Users/dugang/data/sz159949.csv"
sh518880file = "/Users/dugang/data/sh518880.csv"
sh510300df = pd.read_csv(sh510300file)
sh510900df = pd.read_csv(sh510900file)
sh513180df = pd.read_csv(sh513180file)
sh588000df = pd.read_csv(sh588000file)
sz159949df = pd.read_csv(sz159949file)
sh518880df = pd.read_csv(sh518880file)
Merge (inner join) on the time column, keeping only the time and close columns
df = pd.merge(sh510300df[["time","close"]],sh588000df[["time","close"]],on='time')
df.columns=["time","sh510300","sh588000"]
df = pd.merge(df,sh510900df[["time","close"]],on='time')
df.columns=["time","sh510300","sh588000","sh510900"]
df = pd.merge(df,sh513180df[["time","close"]],on='time')
df.columns=["time","sh510300","sh588000","sh510900","sh513180"]
df = pd.merge(df,sz159949df[["time","close"]],on='time')
df.columns=["time","sh510300","sh588000","sh510900","sh513180","sz159949"]
df = pd.merge(df,sh518880df[["time","close"]],on='time')
df.columns=["time","sh510300","sh588000","sh510900","sh513180","sz159949","sh518880"]
df.head()
time sh510300 sh588000 sh510900 sh513180 sz159949 sh518880
0 2021-05-25 5.327 1.408 1.150 1.003 1.398 3.812
1 2021-05-26 5.328 1.404 1.156 1.004 1.385 3.852
2 2021-05-27 5.347 1.437 1.153 1.005 1.396 3.829
3 2021-05-28 5.332 1.430 1.150 0.991 1.403 3.800
4 2021-05-31 5.338 1.484 1.149 0.998 1.443 3.828
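The repeated merge-and-rename steps above can also be written as a loop. The following is a minimal sketch rather than the code used in this post: the helper name `join_close` and the small in-memory sample frames are assumptions made for illustration, and the only thing assumed about the input is the `time` and `close` columns from the data format above.

```python
from functools import reduce

import pandas as pd


def join_close(frames):
    """Inner-join the close column of several price tables on 'time'.

    frames: dict mapping a symbol name to a DataFrame that has at least
    'time' and 'close' columns. The result has one row per date present
    in ALL tables and one close-price column per symbol.
    """
    # Rename 'close' to the symbol before merging, so no column names clash.
    renamed = [
        df[["time", "close"]].rename(columns={"close": symbol})
        for symbol, df in frames.items()
    ]
    # pd.merge defaults to how="inner", the same as the step-by-step version.
    return reduce(lambda left, right: pd.merge(left, right, on="time"), renamed)


# Made-up sample rows, for illustration only (not real quotes).
a = pd.DataFrame({"time": ["2021-05-24", "2021-05-25", "2021-05-26"],
                  "close": [5.30, 5.33, 5.32]})
b = pd.DataFrame({"time": ["2021-05-25", "2021-05-26"],
                  "close": [1.41, 1.40]})

merged = join_close({"etf_a": a, "etf_b": b})
print(merged)
```

With the frames loaded earlier, the dict would look like `{"sh510300": sh510300df, "sh588000": sh588000df}`, extended with the other symbols. Because the join is an inner join, the merged table starts at the first date on which every ETF has a quote.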
df.describe()
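# Compute the covariance matrix (drop the non-numeric time column first)
df.drop(columns="time").cov()
# Compute the correlation coefficient matrix
df.drop(columns="time").corr()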
sh510300 sh588000 sh510900 sh513180 sz159949 sh518880
sh510300 1.000000 0.943971 0.905710 0.930391 0.955516 -0.703400
The correlation coefficient between sh510300 and sh518880 (gold ETF) is -0.703400, so the two are negatively correlated.
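The matrices above are computed on price levels. Two series that both trend over the sample can show a large correlation for that reason alone, so a common complement is to compute the same matrices on daily returns. The sketch below assumes a wide table shaped like `df`; the column names `etf_a` and `etf_b` and all prices are made up for illustration.

```python
import pandas as pd

# Made-up closing prices, for illustration only (not real quotes).
prices = pd.DataFrame({
    "time": ["2023-01-03", "2023-01-04", "2023-01-05", "2023-01-06", "2023-01-09"],
    "etf_a": [1.00, 1.10, 1.05, 1.20, 1.15],
    "etf_b": [2.00, 2.05, 2.20, 2.15, 2.30],
})

# Index by date so that only the numeric price columns remain.
close = prices.set_index("time")

# Simple daily return: close[t] / close[t-1] - 1.
# The first row has no previous day and is NaN, so drop it.
returns = close.pct_change().dropna()

level_corr = close.corr()      # correlation of price levels
return_corr = returns.corr()   # correlation of daily returns
return_cov = returns.cov()     # covariance of daily returns

# Correlation is covariance rescaled by both standard deviations.
rescaled = return_cov.loc["etf_a", "etf_b"] / (
    returns["etf_a"].std() * returns["etf_b"].std()
)

print(level_corr)
print(return_corr)
```

In this made-up sample both prices drift upward, so the level correlation is positive, while the daily returns move in opposite directions and their correlation is negative. Which of the two views is appropriate depends on the question being asked.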
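# Keep only data since the start of this year (2023-01-01)
df2 = df[df['time'] >= '2023-01-01']
# Take the first through the last price column, dropping the time column
df3 = df2.iloc[:,1:]
# Take the first row
firstRow = df3.iloc[0]
# Divide every column by its first value
df4 = df3.div(firstRow)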
sh510300 sh588000 sh510900 sh513180 sz159949 sh518880
393 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000
394 1.001266 0.993157 1.027127 1.043321 0.988338 1.003762
395 1.020258 1.002933 1.043157 1.059567 1.023324 1.003511
396 1.024310 1.008798 1.038224 1.043321 1.034985 0.996238
397 1.030641 1.008798 1.048089 1.061372 1.040816 1.007023
# Plot the normalized series for comparison
df4.plot()
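The normalization steps can be wrapped in a small function. This is a sketch under assumptions: the helper name `rebase` and the sample values are hypothetical, and the table is assumed to have a `time` column of ISO-formatted date strings, as in this post.

```python
import pandas as pd


def rebase(df, start):
    """Rescale each price column so its first value on or after `start` is 1.0."""
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    window = df[df["time"] >= start].set_index("time")
    # Dividing by the first row matches on column names, so every
    # column is divided by its own starting value.
    return window.div(window.iloc[0])


# Made-up sample rows, for illustration only (not real quotes).
sample = pd.DataFrame({
    "time": ["2022-12-30", "2023-01-03", "2023-01-04"],
    "etf_a": [4.0, 5.0, 5.5],
    "etf_b": [2.0, 1.0, 0.9],
})

rebased = rebase(sample, "2023-01-01")
print(rebased)
```

Keeping `time` as the index means a call such as `rebased.plot()` should label the x-axis with dates rather than row numbers.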