Monday, June 19, 2023

Python: Hankel Matrices H0, H1, Generalised Eigenvalue of H0 and H1, and Singular Value Decomposition (SVD) used for the Matrix Pencil Method (MPM) and Singular Spectrum Analysis (SSA)

 


This process is the part of the Python code for Prony method introduced in the last post. The process of finding Hankel matrices, Generalised Eigenvalues, and Singular Value Decomposition (SVD) are the essential processes for both the modern Prony method (Matrix Pencil Method (MPM) and Singular Spectrum Analysis (SSA). 

 SSA is the non-parametric method which is said to be influenced by Prony method. For financial time series analyses, SSA is popularly used nowadays. SSA is better at capturing non-linearly time series with erratic fluctuations. SSA is popular due to the multi-resolution decomposition of time series into time trends and volatilities. 

The following Python codes operates the following actions:

- Downloads the foreign exchange rates of Euro based on USD

- Generating a time series variable EUR/USD

- Creating Hankel matrices H0 and H1 based on this time series

- Finding Generalised Eigenvalue and Eigenvectors of H0 and H1

- Singular Value Decomposition (SVD) of the Hankel matrix H0

 

# Execute this cell to have access to the API of Frankfurter App
import requests
import json

# importing necessary tools
import matplotlib.pyplot as plt
import networkx as nx
import numpy as np
import pandas as pd
import random
import statistics
import scipy.linalg
from numpy import linalg as LA
from scipy.stats import qmc
from scipy.optimize import newton
from scipy import stats
import statsmodels.formula.api as sm

import math
import cmath

from scipy.linalg import hankel

import scipy.odr
from scipy import odr

# Showing the plots bigger
plt.rcParams["figure.figsize"] = (12,6)

# defining complex number
i=complex(0,1)
i

# Connecting to the API https://www.frankfurter.app/docs/ to get the EUR/USD rates
url = "http://api.frankfurter.app/1999-01-01.."
# Write your code here
resp = requests.get(url,{'to':'USD'})
#print(resp.url)
#print(resp.status_code)
#print("Type:", type(resp.content))
#print(resp.content)

print("Type:", type(resp))

# Preparing for the right encoding
import ast
byte_str = resp.content
dict_str = byte_str.decode("UTF-8")
mydata = ast.literal_eval(dict_str)
print("Type:", type(mydata))

#mydata #show the accessed data
EURUSD=(mydata["rates"])
#EURUSD

# Sorting the data into the Python dictionary form
Dates_EURUSD=[]
Rates_EURUSD=[]

for key, value in EURUSD.items():
    Dates_EURUSD.append(key)
    for k,v in value.items():
        Rates_EURUSD.append(v)

EURUSD=dict(zip(Dates_EURUSD,Rates_EURUSD))
#EURUSD

#Rates_EURUSD
print("Length of data:",len(Rates_EURUSD))

# Converting the rate to natural logarithm
ln_Rates_EURUSD=np.log(Rates_EURUSD)
ln_Rates_EURUSD
#plt.plot(ln_Rates_EURUSD)   # Uncomment to show plot!

# Creating the log-difference (Daily return) of EUR/USD
d_ln_Rates_EURUSD=np.diff(np.log(Rates_EURUSD))
len(d_ln_Rates_EURUSD)
#plt.plot(d_ln_Rates_EURUSD)   # Uncomment to show plot!

# Obtaining the length which is half of the total length of this variable.
# This is needed for creating the matrix of this variable i.e. n/2 * n/2 matrix
X=d_ln_Rates_EURUSD.copy()
d=(int(len(X)/2-len(X)%2*0.5)) # Subtracting the remainder if it is an odd number
d


f=d_ln_Rates_EURUSD.copy()
n=len(f)
M=100  # sampling rate.
# M is supposed to be twice of the actual signal frequency.
# But, we have never known the true frequency of the financial data.
# So, it is assumed to be 100 because it seems to provide an appropriate answer
print(n) # Checking the total length of this variable.

# Creating Hankel matrix which is needed to
def mat_ge(samples):
    n_samples = len(samples)
    d = int(n_samples/2)
    H0=hankel(samples[0:-1])[0:d,0:d]
    H1=hankel(samples[1:])[0:d,0:d]
    return H0, H1
H0, H1 =mat_ge(f)
H=(H0,H1)

for h in range(len(H)):
    print(f'\n\n \033[1m Hankel Matrix H{h} \033[0m')
    panda_df = pd.DataFrame(data = H[h])
    display(panda_df)

# Finding the Generalised Eigenvalue of these Hankel matrices H0 and H1
print('\n\n Finding the Generalised Eigenvalue and Eigenvectors of these Hankel matrices H0 and H1 ')
Eig = scipy.linalg.eig(H0, H1)
Evalue= Eig[0]
Evectors = Eig[1:]
Evalue_real=Evalue.real
print('\n\n \033[1m Eigenvalue \033[0m')
panda_df = pd.DataFrame(data = Evalue)
display(panda_df)
print('\n\n \033[1m Eigenvectors \033[0m')
display(Eig[1:])
   
# Showing Verndermonde Matrix
Veig=np.transpose(np.vander(Evalue, N=None, increasing=True))
#print('\n\n Vandermonde Matrix of the eigenvalues')
#panda_df = pd.DataFrame(data = Veig)
#display(panda_df)


# Defnining the function plotting Singular Value Decomposition (SDV)
# This apprximate the number of the physical/true signals i.e. the number of true alpha s and phi s
def plot_SVD(A):
    SDV=LA.svd(A, full_matrices=True, compute_uv=True, hermitian=False)[1]
    Ln_SDV=[math.log(SDV[k]) for k in range(len(SDV)) ]
    plt.plot(Ln_SDV)
    plt.title('Singular Values')
    plt.xlabel('')
    plt.ylabel('Natural Log')
    plt.show
print('\n\n Singular Value decomposition (SVD) of the Hankel matrix H0')
plot_SVD(H0)

 

 

P.S. This following website shows the useful code to simulate Singular Spectrum Analysis (SSA) with Python

Introducing SSA for Time Series Decomposition
Python · MotionSense Dataset : Smartphone Sensor Data - HAR
https://www.kaggle.com/code/jdarcy/introducing-ssa-for-time-series-decomposition