Matrix Pencil (MP) is being tested on a stock market dataset. The time series include market indices such as the NASDAQ and individual company stocks, with Apple Inc. used as an example. MP reconstructs a smoothed time series of the index and stock values by separating out noise and errors. The advantage of MP lies in its ability to forecast the direction of these values while accounting for volatility. Additionally, it provides future predictions of the seasonal behaviour of these time series.
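For reference, MP models the sampled series as a finite sum of complex exponentials; this is the model that the alphas (amplitudes) and phis (frequencies) in the code below correspond to:

$$y(t_j) \approx \sum_{k=1}^{\nu} \alpha_k \, e^{\phi_k t_j}, \qquad t_j = 0, 1, 2, \ldots$$

The real parts of the $\phi_k$ capture damping or growth, while the imaginary parts capture oscillation, i.e. the seasonal behaviour.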
The Python code I developed lets users select the index and stock to analyse by simply changing their ticker symbols at the start of the code. It calls the Yahoo Finance API to download the relevant dataset automatically. MP analyses and estimates the index and stock values independently. The final graph displays the percentage difference between the stock value and the index value, as determined through the MP analysis.
The code includes the Cadzow iteration, a noise-filtering method, to aid in identifying the number of physical terms. Filtering a time series with the Cadzow iteration yields a smoother, more consistent estimate with reduced fluctuations. This also gives a clearer projection of future trends, helping investors assess whether a stock is outperforming or underperforming relative to the market index.
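To illustrate the idea behind the loop-based implementation further down: a single Cadzow step truncates the SVD of the Hankel matrix of the series to a chosen rank, then averages the anti-diagonals to restore the Hankel structure, and iterating this yields the filtered series. Below is a minimal sketch (my own compact restatement, not the exact code used later), assuming a rectangular Hankel matrix with roughly one third of the samples as columns:

import numpy as np
from scipy.linalg import hankel

def cadzow_step(f, rank):
    # One Cadzow step: rank-truncate the Hankel SVD, then re-Hankelise
    # by averaging the anti-diagonals (illustrative sketch only).
    f = np.asarray(f, dtype=float)
    N = len(f)
    L = N // 3                                    # number of columns
    H = hankel(f[:N - L + 1], f[N - L:])          # (N-L+1) x L Hankel matrix, H[i,j]=f[i+j]
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    Hr = (U[:, :rank] * s[:rank]) @ Vt[:rank, :]  # best rank-`rank` approximation
    out = np.zeros(N); counts = np.zeros(N)
    for r in range(Hr.shape[0]):
        for c in range(Hr.shape[1]):
            out[r + c] += Hr[r, c]                # sum each anti-diagonal...
            counts[r + c] += 1
    return out / counts                           # ...and average it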
At this stage of the experiment, I have implemented a moving window of 60 business days (two 30-day blocks, roughly the length of a fiscal quarter) to generate future forecasts from one year's worth of input data over the preceding interval. Unfortunately, I was unable to find suitable code to save the results as a video directly from the Jupyter Notebook environment.
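One workaround I have not yet verified: matplotlib's FuncAnimation objects can be written to disk with the library's built-in writers, e.g. a GIF via PillowWriter (no external software required) or an MP4 via FFMpegWriter (which requires ffmpeg on the system path). A minimal sketch with a placeholder animation:

import matplotlib.pyplot as plt
import matplotlib.animation as animation

fig, ax = plt.subplots()
line, = ax.plot([], [])

def animate(k):
    # placeholder frame update; in Future_Forecast below, animate() redraws the forecast window
    line.set_data(range(k + 1), [v * v for v in range(k + 1)])
    ax.relim(); ax.autoscale_view()

anim = animation.FuncAnimation(fig, animate, frames=10, interval=500)
anim.save("forecast.gif", writer=animation.PillowWriter(fps=2))   # GIF, no ffmpeg needed
# anim.save("forecast.mp4", writer=animation.FFMpegWriter(fps=2)) # MP4, requires ffmpeg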
The mean relative error of the future forecast compared with the actual values remains below 0.05 for market indices such as the NASDAQ and S&P 500. However, for individual company stocks, the values need to be natural-log transformed before analysis to keep the relative error below 0.05.
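Concretely, the stock series is passed through np.log before the analysis (as in the Apple Inc. example at the end of the code); a forecast made in log space can be mapped back to the price level with np.exp. A short sketch with placeholder prices:

import numpy as np
y_t = np.array([101.2, 103.5, 102.8, 107.1])  # placeholder close prices
ln_y_t = np.log(y_t)                          # analyse and forecast in log space
ln_forecast = ln_y_t                          # stand-in for an MP forecast of the logged series
price_forecast = np.exp(ln_forecast)          # back-transform to the price level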
Overall, the results successfully demonstrate the utility of this tool in providing accurate future price predictions.
# importing necessary tools for math and plot
# For mathematics
import numpy as np
import random
import statistics
import scipy.linalg
from numpy import linalg as LA
from scipy.stats import qmc
from scipy import stats
import statsmodels.formula.api as sm
import math
import cmath
from scipy.linalg import hankel
import scipy.odr
from scipy import odr
# defining imaginary number
i=complex(0,1)
# For graphing
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import networkx as nx
import pandas as pd
from pandas.tseries.offsets import *
# For plotting with multiple axes with different scales
# https://stackoverflow.com/questions/9103166/multiple-axis-in-matplotlib-with-different-scales
from mpl_toolkits.axes_grid1 import host_subplot
import mpl_toolkits.axisartist as AA
# Inline animations in Jupyter
# https://stackoverflow.com/questions/43445103/inline-animations-in-jupyter
from matplotlib.animation import FuncAnimation
import matplotlib.animation as animation
import itertools # for joining lists inside a list
from itertools import count
# Showing the plots bigger
#plt.rcParams["figure.figsize"] = (18,9)
plt.rcParams["figure.figsize"] = (9,5)
# Import requests for downloading an image
import requests
from PIL import Image
# Calling stock values from Yahoo finance API
# Ref. https://www.kaggle.com/code/alessandrozanette/s-p500-analysis-using-yfinance-data
## Calling API from Yahoo Finance: Uncomment the line "!pip install yfinance" if not done
# !pip install yfinance # Uncomment it to run if not installed yet in a different cell separated from this one.
import yfinance as yf
%config InlineBackend.figure_format='retina'
import warnings
warnings.filterwarnings("ignore")
" ######### The modern Prony's method: Matrix Pencil (MP) ######### "
## Defining the functions for the exponential analysis and reconstruction
# Displaying an image file showing the Matrix Pencil algorithm
url = "https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhNIp8sXe8f37WJIGxlJKcHw2eS1rnqpKknG8PoOHWNDvK5PvxYLtj6IFkWIwhuWlP7BJUSICAp4njNm32wTokNBk4gb0SuWYIiusTM7hXVOa4QR1oJr1Or2Y9qoD-yddHF8ZF_rUV5opFwSvKWG_4mJFcKAIsTx0mnUA_lnIITS68h8MOQoEWC/w465-h554/MatrixPencil_algorithm.jpg"
img_data = requests.get(url, stream = True).raw
display(Image.open(img_data))
# Defining the function of the Matrix Pencil (MP) method,
# one of the Prony-like methods.
def MP(samples,nterms):
    # First, create a rectangular Hankel matrix directly from the input time series
n_samples=len(samples)
ncols_value=1/3
ncols=int(n_samples*ncols_value)
nrows=int(n_samples+1-ncols)
if nterms>math.floor(n_samples*ncols_value):
if min(ncols,nrows)==ncols: #Increase the number of columns
d=nterms-ncols
ncols=ncols+d
nrows=nrows-d
elif min(ncols,nrows)==nrows: #Increase the number of rows
d=nterms-nrows
ncols=ncols-d
nrows=nrows+d
H=hankel(samples)[0:nrows,0:ncols]
    # Second, extract the unitary matrix (left singular vectors) of the Hankel matrix
SVD_H=LA.svd(H, full_matrices=True, compute_uv=True, hermitian=False)
U=SVD_H[0]
U0=U[0:len(U)-1,0:nterms]
U1=U[1:len(U),0:nterms]
    # Generating (small) omega, the periodic time-steps.
    n=math.floor(len(samples)/2)
    _2n_1=(2*n-1)+1
    M=1 # Sampling rate:
    # According to the Shannon-Nyquist sampling theorem, the sampling rate must be
    # at least twice the highest frequency present in the signal.
    # However, M=1 is used here because financial data is not based on a particular signalling-cycle model.
    #omega = [2*math.pi/M*(_0__2n_1) for _0__2n_1 in range(_2n_1)] # Signal processing time steps
    omega = np.arange(_2n_1) # using simple discrete time steps (updated 17/07/2024)
# Obtaining (big) Omega containing the frequencies and the time step
    # Generalised eigenvalues are decomposed by eig(U0\U1)
# https://fr.mathworks.com/help/matlab/ref/eig.html
OMEGA= scipy.linalg.eig(np.dot(np.linalg.pinv(U0),(U1)))[0]
# frequencies and damping factors
# phi_rec = [cmath.log(OMEGA[k])*M/(2*math.pi) for k in range(len(OMEGA))] # if using the signal process time step
    phi_rec = [cmath.log(OMEGA[k]) for k in range(len(OMEGA))] # if using simple discrete time steps (updated 17/07/2024)
# Retrieving alpha
V=np.transpose(np.vander(OMEGA, N=len(samples), increasing=True))
# https://stackoverflow.com/questions/11479064/multiple-linear-regression-in-python
y = samples
X = V
coeffs = np.linalg.lstsq(X, y, rcond=None)[0]
alpha_rec=coeffs
alpha_rec=alpha_rec.tolist()
    # sorting alphas and phis according to the absolute values of alphas
abs_alpha=[abs(alpha_rec[k]) for k in range(len(alpha_rec))]
#alpha_rec = [abs_alpha for _,abs_alpha in sorted(zip(abs_alpha,alpha_rec), reverse=True)]
#phi_rec = [abs_alpha for _,abs_alpha in sorted(zip(abs_alpha,phi_rec), reverse=True)]
return alpha_rec,phi_rec
# Function for generating time series based on alphas and phis
def Exp(amp,freq,samples):
n=math.floor(len(samples)/2)
y=[]
# Generating x, the periodic time-steps.
_2n_1=(2*n-1)+1
    omega = np.arange(_2n_1) # using simple discrete time steps
for omg in range(len(omega)):
yi=0
for ap in range(len(amp)):
yi+=amp[ap]*cmath.exp(freq[ap]*omega[omg])
y.append(yi)
return y
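# A quick smoke test of MP and Exp (an illustrative addition, not part of the original analysis):
# a three-term signal (one real exponential plus one cosine, i.e. a conjugate pair of
# complex exponentials) should be recovered almost exactly. Uncomment to try:
#   test = [5*math.exp(0.01*t) + 2*math.cos(0.5*t) for t in range(120)]
#   a_rec, p_rec = MP(test, 3)         # 1 real term + 1 complex-conjugate pair
#   test_hat = Exp(a_rec, p_rec, test) # compare [v.real for v in test_hat] with test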
# Displaying alphas and phis in a table
def ParametersDisplay(AlphasPhis,nterms):
Header=['alphas','phis']
List=AlphasPhis
Table=dict(zip(Header,List))
panda_df = pd.DataFrame(data = Table)
pd.set_option('display.max_rows',nterms)
panda_df.head(nterms)
display(panda_df)
' ######### Condition Number of the Vandermonde system ######### '
# Producing a Vandermonde Matrix of Omega based on the MP method
def CondNo_VanderM_MP(samples,nterms):
    # First, create a rectangular Hankel matrix directly from the input time series
n_samples=len(samples)
ncols_value=1/3
ncols=int(n_samples*ncols_value)
nrows=int(n_samples+1-ncols)
if min(ncols,nrows)==ncols: #Increase the number of columns
d=nterms-ncols
ncols=ncols+d
nrows=nrows-d
elif min(ncols,nrows)==nrows: #Increase the number of rows
d=nterms-nrows
ncols=ncols-d
nrows=nrows+d
H=hankel(samples)[0:nrows,0:ncols]
    # Second, extract the unitary matrix (left singular vectors) of the Hankel matrix
SVD_H=LA.svd(H, full_matrices=True, compute_uv=True, hermitian=False)
U=SVD_H[0]
U0=U[0:len(U)-1,0:nterms]
U1=U[1:len(U),0:nterms]
    # Generating (small) omega, the periodic time-steps.
    n=math.floor(len(samples)/2)
    _2n_1=(2*n-1)+1
    M=1 # Sampling rate:
    # According to the Shannon-Nyquist sampling theorem, the sampling rate must be
    # at least twice the highest frequency present in the signal.
    # However, M=1 is used here because financial data is not based on a particular signalling-cycle model.
    omega = [2*math.pi/M*(_0__2n_1) for _0__2n_1 in range(_2n_1)]
# Obtaining (big) Omega containing the frequencies and the time step
    # Generalised eigenvalues are decomposed by eig(U0\U1)
# https://fr.mathworks.com/help/matlab/ref/eig.html
OMEGA= scipy.linalg.eig(np.dot(np.linalg.pinv(U0),(U1)))[0]
# frequencies and damping factors
phi_rec = [cmath.log(OMEGA[k])*M/(2*math.pi) for k in range(len(OMEGA))]
# Vandermonde Matrix
VandermondeMatrix=np.transpose(np.vander(OMEGA, N=n_samples, increasing=True))
# and then its condition number
ConditionNumber=LA.cond(VandermondeMatrix)
return ConditionNumber
# Finding maximum number of the terms referring to the condition number
def max_NTERMS(f_t):
N=len(f_t)
ln_CondNos=[]; NTERMS_Candidates=[]
for n in range((int(math.floor(N/3))),(int(math.floor(N/2)))+1):
NTERMS_Candidates.append(n)
ln_CondNos.append(math.log(CondNo_VanderM_MP(f_t,n)))
#plt.plot(NTERMS_Candidates,ln_CondNos)
return NTERMS_Candidates[np.max(np.where(np.array(ln_CondNos)< 1.5*statistics.mean(np.array(ln_CondNos)))) ]
' ######### Singular Value Decomposition (SVD) ######### '
# Plot the logged singular values and their log differences,
# and return the odd-numbered indices of the terms representing the physical components,
# as a reference for checking how many exponential components can be grouped together.
# ref. https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.hankel.html
def plot_lnSVD_logdiff(samples,PrintGraph):
# calculating logged SVs and their log difference
N=math.floor(len(samples))
H=hankel(samples)[0:math.floor(N*2/3),0:math.floor(N*1/3)] # rectangular matrix
#H=hankel(samples)[0:math.floor(N*1/2),0:math.floor(N*1/2)] # square matrix
SVD=LA.svd(H, full_matrices=True, compute_uv=True, hermitian=False)[1]
ln_SVD=[math.log(SVD[k]) for k in range(len(SVD)) ]
d_ln_SVD=[abs(ln_SVD[k+1]-ln_SVD[k]) for k in range(len(SVD)-1) ];
# print(len(ln_SVD)) # for debug
    if PrintGraph==1:
        # Plotting two line plots, each with its own scale
        print("Singular Value Decomposition (SVD)")
        plt.figure()
        fig, ax1 = plt.subplots()
        color = 'tab:green'
        ax1.set_xlabel('Singular value index')
        ax1.set_ylabel('ln(SV)', color=color)
        ax1.plot(ln_SVD, color=color)
        ax1.tick_params(axis='y', labelcolor=color)
        ax2 = ax1.twinx()  # instantiate a second Axes that shares the same x-axis
        color = 'tab:pink'
        ax2.set_ylabel('|diff ln(SV)|', color=color)  # the x-label is already handled by ax1
        ax2.plot(d_ln_SVD, color=color)
        ax2.tick_params(axis='y', labelcolor=color)
        fig.tight_layout()  # otherwise the right y-label is slightly clipped
        plt.show(block=False)
    # Selecting the indices of the significant terms representing the physical components
    # The number of indices has to be odd, as the first term is the real-valued constant
    # and the remaining complex terms come in conjugate pairs
    EnumSVD = list(enumerate(d_ln_SVD, start=1))
    TermIndices=[]; Threshold=statistics.mean(d_ln_SVD);#print(Threshold)
    for k in range(math.floor(len(d_ln_SVD)*8/10)):
        #print(EnumSVD[k]) # for debug
        if EnumSVD[k][1]>Threshold: # !! ** Select the threshold for selecting terms ** !!
            #print(EnumSVD[k]) # for debug
            if EnumSVD[k][0]%2==0:
                TermIndices.append(EnumSVD[k][0]+1)
            else:
                TermIndices.append(EnumSVD[k][0])
#TermIndices=(list(set(TermIndices)))
#print(TermIndices) # for debug
return TermIndices
' ######### Cadzow iteration ######### '
# Cadzow iteration to filter out the noise terms, making it easier to pick up the physical component terms
def Cadzow_Iteration(f_t, PrintGraph):
N=len(f_t)
m=math.floor(N*2/3) # rows
H=hankel(f_t)[0:m,0:(N-m)]
    NTRY=plot_lnSVD_logdiff(f_t,PrintGraph)[-1] # Obtaining the number of the columns to keep in the Cadzow iteration
tf=1; thresh=0.05;hmax=0.0005; CadzowILoopCount=0;
hmean=np.zeros(N)
while tf==1: # Loop while hmax>thresh i.e. until hmax<thresh
        tf=0; # Set the boolean to false at the beginning of the loop
U,s,Wtrans = LA.svd(H, full_matrices=True, compute_uv=True, hermitian=False) # Decomposing Hankel to the eigentriple
S=np.diag(s) # Transforming an array of SVs to a diagonal matrix
# Cutting out the columns of the eigentriple
U=U[:,0:NTRY]
S=S[0:NTRY,0:NTRY]
Wtrans=Wtrans[0:NTRY,:]
H=np.dot(np.dot(U,S),Wtrans) # multiplying these eigentriples to recreate a matrix
# The following operation for Hankelising this matrix
# First block iteration: Denominator increasing
for k in range(N-m):
hmean[k]=0;
for l in range(k+1):
hmean[k]=hmean[k]+H[k-l,l]; #print(1, k, l, H[k-l,l], hmean[k])
hmean[k]=hmean[k]/(k+1) ;
hmax=0.0005;
for l in range(k+1):
hmax=max(hmax,abs(H[k-l,l]-hmean[k])/abs(hmean[k]));
H[k-l,l]=hmean[k]
#print(tf)
tf = tf or int(hmax>thresh);
        # Second block iteration: Denominator fixed
for k in range(N-m,m):
hmean[k]=0;
for l in range(N-m):
hmean[k]=hmean[k]+H[k-l,l];
hmean[k]=hmean[k]/(N-m) ;
hmax=0.0005;
for l in range(N-m):
hmax=max(hmax,abs(H[k-l,l]-hmean[k])/abs(hmean[k]));
H[k-l,l]=hmean[k]
tf = tf or int(hmax>thresh);
# Third block iteration
for k in range(m,N):
hmean[k]=0;
for l in range((N-m),(k-m),-1):
hmean[k]=hmean[k]+H[k-l,l-1];
hmean[k]=hmean[k]/(N-k) ;
hmax=0.0005;
for l in range((N-m),(k-m),-1):
hmax=max(hmax,abs(H[k-l,l-1]-hmean[k])/abs(hmean[k]));
H[k-l,l-1]=hmean[k];
tf = tf or int(hmax>thresh)
CadzowILoopCount=CadzowILoopCount+1
if PrintGraph==1:
print('Cadzow Loop Count',CadzowILoopCount)
#print(CadzowILoopCount)
# Obtaining a filtered time series data from Hankel matrix after Cadzow iteration
f_t_tilde=np.concatenate((H[0:m,1],H[m-1,0:(N-m)]))
    # Finding the number of terms from the Cadzow-iterated series
NTERMS_list=plot_lnSVD_logdiff(f_t_tilde,PrintGraph)
NTERMS=NTERMS_list[-1]
return NTERMS
' ######### Graph output ######### '
# Plotting two lines
def plot2(Xticks,f,f_hat,Title):
# generating the reconstructed data given alphas and phis
#print(Title)
xi = list(range(len(Xticks)))
plt.figure();
plt.plot(f, label="original", color='darkseagreen')
plt.plot(f_hat, label="reconstructed", color='Magenta')
plt.legend(loc="best")
plt.title(Title)
    plt.xlabel('Business days')
plt.ylabel('Price')
plt.xticks(xi, Xticks, rotation=90)
plt.show(block=False)
# Plotting two lines
def plot2_future(amp,freq,samples,NoFutureTimePoints,Title):
# Function for generating time series based on alphas and phis
N=len(samples)
N_new=N+NoFutureTimePoints
n=math.floor(N_new/2)
y=[]
# Generating x, the periodic time-steps.
_2n_1=(2*n-1)+1
    omega = np.arange(_2n_1) # using simple discrete time steps
for omg in range(len(omega)):
y_j=0
for ap in range(len(amp)):
y_j+=amp[ap]*cmath.exp(freq[ap]*omega[omg])
y.append(y_j)
# generating the reconstructed data given alphas and phis
#print(Title)
plt.figure();
plt.plot(samples, label="Original", color='darkseagreen')
plt.plot(y, label="Reconstructed", color='Magenta')
plt.legend(loc="upper right")
plt.title(Title)
    plt.xlabel('Business days')
plt.ylabel('Price')
plt.show(block=False)
return y
# Computing the future prediction values
def Future_Prediction_Values(amp,freq,samples,NoFutureTimePoints):
# Function for generating time series based on alphas and phis
N=len(samples)
N_new=N+NoFutureTimePoints
n=math.ceil(N_new/2)
f=[]
# Generating x, the periodic time-steps.
_2n_1=(2*n-1)+1
    omega = np.arange(_2n_1) # using simple discrete time steps
for omg in range(len(omega)):
f_j=0
for ap in range(len(amp)):
f_j+=amp[ap]*cmath.exp(freq[ap]*omega[omg])
f.append(f_j)
f_future=f[N:N_new];#print(len(f_future))
return f_future
' ######### Calling Yahoo Finance dataset with API ######### '
def Dataset(Ticker_stock, No_Years_Data):
    # Parameter Adjustment
    # Ticker_stock: the ticker symbol of the stock or index to analyse
    # e.g. Ticker="^GSPC" for S&P 500, Ticker="NDAQ" for NASDAQ, Ticker="TOPX" for TOPIX
    # Ticker="AMZN" for Amazon.com Inc., "AAPL" for Apple Inc., Ticker="NTDOY" for NINTENDO
    # No_Years_Data: the number of years of data to download
# Now, let's retrieve the data of the past x years using yfinance
Years="".join([str(No_Years_Data),'y']) # Set the number of years ! Adjust
    y_t = yf.Ticker(Ticker_stock).history(period=Years) # Ticker symbol ! Adjust !
# Another way to dl data
# y_t = yf.download(Ticker_stock, start="2015-01-01", end="2023-12-31")
# Let's take a look at the called data
#display(y_t.tail())
# Change to list
# https://stackoverflow.com/questions/39597553/from-datetimeindex-to-list-of-times
t_j=y_t.index.tolist()
t_j=[t_j[j].strftime("%Y-%m-%d") for j in range(len(t_j))]
# change from numpy array to list
y_t=y_t["Close"].tolist()
# # Plotting the data y_t with its date t
# for k in range(No_Years_Data*4):
# Window=int(math.floor(len(t)/No_Years_Data)/4)
# St=k*Window; Ed=St+Window; print(St,Ed)
# plt.figure(); plt.plot(t[St:Ed],y_t[St:Ed]); plt.xticks(rotation=90); plt.show()
## Linear Interpolation of data
# https://stackoverflow.com/questions/2315032/how-do-i-find-missing-dates-in-a-list-of-sorted-dates
# https://stackoverflow.com/questions/13019719/get-business-days-between-start-and-end-date-using-pandas
t_raw=t_j.copy()
t_intpl = pd.date_range(start=t_raw[0], end=t_raw[-1], freq=BDay()) # Interpolating only for business days
y_raw=y_t.copy()
y_intpl = np.interp(pd.to_datetime(t_intpl).astype(int), pd.to_datetime(t_raw).astype(int), y_raw)
t_j=t_intpl.copy(); y_t=y_intpl.copy()
t_j=t_j.tolist()
t_j=[t_j[j].strftime("%Y-%m-%d") for j in range(len(t_j))]
return t_j, y_t
' ######### Perfect Reconstruction ######### '
def Perfect_Reconstruction(t_j, y_t,Ticker_stock):
    # Retrieving frequency terms (phis) and amplitudes (alphas) for the perfect reconstruction
    alphas, phis=MP(y_t,max_NTERMS(y_t))
    # Reconstructing the time series
    y_t_hat=Exp(alphas, phis,y_t)
# Showing two lines: the original and the reconstructed time series
plot2(t_j, y_t,y_t_hat,f"{Ticker_stock} : Original & Reconstruction")
return y_t_hat
' ######### Denoised Reconstruction ######### '
def Denoised_Reconstruction(t_j, y_t,Ticker_stock):
    # Retrieving frequency terms (phis) and amplitudes (alphas) for the denoised reconstruction
    alphas, phis=MP(y_t,Cadzow_Iteration(y_t,0))
    # Reconstructing the time series
    y_t_hat=Exp(alphas, phis,y_t)
# Showing two lines: the original and the reconstructed time series
plot2(t_j, y_t,y_t_hat,f"{Ticker_stock} : Original & Reconstruction")
return y_t_hat
# For the future prediction
def Future_Forecast(t_j, y_t,Ticker_stock,CadzowIf0OtherwiseManual):
# Inline animations in Jupyter
# https://stackoverflow.com/questions/43445103/inline-animations-in-jupyter
epsilon=[] # storing errors
    N_anal=int(math.floor(len(t_j)/No_Years_Data)) # Analysis time period (uses the global No_Years_Data set before calling)
Window=30; # Future time period
for b in range(math.floor(len(t_j)/Window)-1):
st_anal=0+Window*b; ed_anal=st_anal+N_anal; #print("Analysis:", st_anal,ed_anal)
st_pred=ed_anal+1; ed_pred=st_pred+Window; #print("Prediction:", st_pred,ed_pred)
if ed_pred+Window >= len(t_j):
break
plt.rcParams["animation.html"] = "jshtml"
plt.rcParams['figure.dpi'] = 150
fig, ax = plt.subplots()
x_value = []
y1_value = []
y2_value = []
epsilon_value=[]
count_ = count();
def animate(a):
counts=next(count_);
#Dates
x_value=t_j[st_pred+counts:ed_pred+counts].copy(); #print(x_value)
# Original time series
y1_value=y_t[st_pred+counts:ed_pred+counts].copy(); #print("y1_value: ",y1_value)
# Reconstructed time series
y0=y_t[st_anal+counts:ed_anal+counts].copy(); #print(st_anal+counts, ed_anal+counts)
if CadzowIf0OtherwiseManual==0:
nu=Cadzow_Iteration(y0,0); # Using Cadzow
#nu=plot_lnSVD_logdiff(y0,0)[-1]; print("nu:",nu) # Using the highest SV with high diff_SV
else:
nu=CadzowIf0OtherwiseManual
# nu=nu; #print('nu:',nu)
alphas, phis=MP(y0,nu)
y2_value=Future_Prediction_Values(alphas,phis,y0,Window);
y2_value=np.absolute(y2_value); #print("y2_value: ",(y2_value))
y2_value=y2_value[:len(y1_value)]
" Errors"
epsilon_value=[abs((y1_value[w]-y2_value[w])/y2_value[w]) for w in range(min(Window,len(y1_value)))] # Relative
#epsilon_value=[abs((y1_value[w]-y2_value[w])/y1_value[w]) for w in range(min(Window,len(y1_value)))] # Relative
#epsilon_value=[abs((y1_value[w]-y2_value[w])) for w in range(min(Window,len(y1_value)))] # Absolute
epsilon.append(epsilon_value)
epsilon_bar=statistics.mean(epsilon_value)
" Graph "
ax.cla()
ax.plot(x_value,y1_value, label="Original", color='darkseagreen')
ax.plot(x_value,y2_value, label="Reconstruction", color='Magenta')
plt.legend(loc="upper right")
plt.xticks(rotation=90)
plt.title(f'{Ticker_stock} : From {t_j[st_pred+counts]} to {t_j[ed_pred+counts]} with the average error {round(epsilon_bar*100,1)}%')
ax.set_xlim(0,Window)
ylim=y_t[st_pred:ed_pred+Window]
ax.set_ylim(min(ylim)*0.9,max(ylim)*1.1)
Animation=animation.FuncAnimation(fig, animate, frames=Window, interval = 500)
display(Animation)
# Separating figures
plt.show(block=False)
# Analysing errors
    # joining lists inside a list
epsilon=list(itertools.chain.from_iterable(epsilon))
print("Mean Relative Error:", statistics.mean(epsilon))
plt.figure(); plt.hist(epsilon, color = "lightgreen"); plt.show(block=False)
return epsilon
' ######### Testing with synthetic signal data ######### '
# Synthetic signaling data
def Synthetic(NoiseLevel):
def syn_exp(amp,freq,x):
t=len(amp)
z=[]
for omg in range(len(x)):
zi=0
for ap in range(len(amp)):
zi+=amp[ap]*cmath.exp(freq[ap]*x[omg])
z.append(zi)
return z
# signal Business Cycle 1
phi = [0.01, 0.3*i, 1*i, -1+3*i, -1-2*i, 0.00+0.5*i, 11*i, -9*i, 30*i]
alpha = [4, 3, 2, 2, 2, 1, 1, 1, 0.1]
nterm=len(phi)
# sampling rate
M = 100
    # n: half the number of samples (2n samples are generated below)
n = 300
    # z_syn: samples used by the Prony-like method
_2n_1=(2*n-1)+1
omega = [2*math.pi/M*(_0__2n_1) for _0__2n_1 in range(_2n_1)]
    # Defining the synthetic signal
z_syn = syn_exp(alpha, phi, omega)
# Defining the noise
noise=[np.random.normal(0) for k in range(len(z_syn))]
noise=noise/LA.norm(noise)
    # eps: noise level, e.g. NoiseLevel=3 gives eps=10**(-3); also try 10**(-0)
    eps = 10**(-NoiseLevel); # note: ** is exponentiation in Python (^ is bitwise XOR)
z = z_syn + eps*noise;
z = [ z[k].real for k in range(len(z))]
#z=z[10:-10]
z=[z[j]+100 for j in range(len(z))]
import datetime
t=[datetime.datetime.fromtimestamp(d*100000).date() for d in range(len(z))]
t=[t[j].strftime("%Y-%m-%d") for j in range(len(t))]
return t, z
t_z,z_j=Synthetic(3)
plt.plot(t_z,z_j, color='darkseagreen')
# Perfect reconstruction of the synthetic data
z_j_hat=Perfect_Reconstruction(t_z, z_j,"Synthetic Data")
# Forecast
No_Years_Data=2 # used globally inside Future_Forecast to set the analysis period
epsilon_z=Future_Forecast(t_z, z_j,"Synthetic Data",9) # manually setting 9 terms
' ######### Calling dataset to analyse ######### '
# Example of NASDAQ
# Parameter Adjustment
# Set Ticker Symbol
# e.g. Ticker="^GSPC" for S&P 500, Ticker="NDAQ" for NASDAQ, Ticker="^N225" for Nikkei
# Ticker="AMZN" for Amazon.com Inc., "AAPL" for Apple Inc., Ticker="NTDOY" for NINTENDO
Ticker_stock="NDAQ" # Ticker symbok for a stock
# Set the number of years to download data
No_Years_Data=2;
# Output
t_j, y_t = Dataset(Ticker_stock, No_Years_Data)
# Perfect reconstruction
y_t_hat=Perfect_Reconstruction(t_j, y_t,Ticker_stock)
# Picking up the structural components for the future prediction
# SVD and Cadzow iteration
Cadzow_Iteration(y_t, 1)
# Denoised reconstruction: Picking up the structural components
y_y_hat=Denoised_Reconstruction(t_j, y_t,Ticker_stock)
# Future forecast
epsilon=Future_Forecast(t_j, y_t,Ticker_stock,0) # Pass 0 to use the Cadzow iteration to find the number of terms nu
## Example with Apple Inc.
# Parameter Adjustment
# Set Ticker Symbol
# e.g. Ticker="^GSPC" for S&P 500, Ticker="NDAQ" for NASDAQ, Ticker="^N225" for Nikkei
# Ticker="AMZN" for Amazon.com Inc., "AAPL" for Apple Inc., Ticker="NTDOY" for NINTENDO
Ticker_stock="AAPL" # Ticker symbok for a stock
# Set the number of years to download data
No_Years_Data=2;
# Output
t_j, y_t = Dataset(Ticker_stock, No_Years_Data)
# Perfect reconstruction
y_t_hat=Perfect_Reconstruction(t_j, y_t,Ticker_stock)
# # Future forecast
# SaveVideo=0
# epsilon=Future_Forecast(t_j, y_t,Ticker_stock,SaveVideo)
# Using natural-logged data instead
ln_y_t = np.log(np.array(y_t))
# Denoised reconstruction: Picking up the structural components
y_y_hat=Denoised_Reconstruction(t_j, ln_y_t,Ticker_stock)
# Future forecast
epsilon=Future_Forecast(t_j, ln_y_t,Ticker_stock,0) # Pass 0 to use the Cadzow iteration to find the number of terms nu