Tuesday, December 26, 2023

Reimu Hakurei 博麗 霊夢、夢想天生 (Musou Tensei)

 





Hakurei Reimu - 博麗 霊夢, the Protagonist in Touhou project (Touhou 東方 project ), the #Shinto ist Shrine Maiden. 神道. Posing for her special move 夢想天生 (Musou-Tensei)

Tool used: CLIP STUDIO PAINT

 These illustrations are posted in my Pixiv page so please visit https://www.pixiv.net/en/users/32334195!

 





 

 

Thursday, December 21, 2023

Discussion on Prony-like method for financial data and its potential adoption to econometrics

1. Introduction of Prony-like method

In terms of analysing time series data exhibiting a periodic behaviour, many financial data analysts have adapted Fourier transfer. However, Fourier transfer has a critical disadvantage of dealing with the non-stable signal as shown in these graphs and failing to detect more than one signal frequencies close to each other.

There is an alternative called Prony-like method which adopts the signal processing method of invented by Baron Gaspard Clair François Marie Riche de Prony (22 July 1755 – 29 July 1839)

De Prony was the mathematician famous at the contemporary time period of when Fourier was alive. Fourier method has been popular for a long time period mainly because of the convenience with a limited computational capability. Nevertheless, the current advancement of computational mathematics has enabled to apply Prony method to the advanced mathematical data analyses.

Prony method is advantageous to detect various signal frequencies with their amplitude (coefficient) of a time series with a periodic behaviour. Through the time length, it can describe various waving formations of a time series. For example, the seasonality and the event risk can be detected by observing these varying trends.

There is also a signal processing method similar to but different from Prony-like method which is called Singular Spectrum Analysis (SSA).  The analogy of Prony-like method and SSA describes that they are like cousins. SSA does not retrieve parameters like Prony-like method while SSA is faster and less complicated to decompose periodic time series.


2. Mathematical problem statement of Prony-like method

The modern Prony-like method utilises linear algebraic methods to find the exponentials representing the frequency-components and then their amplitudes. Time series is transformed to a symmetric square matrix called Hankel matrix.

The exponential components are found from extracting the generalised eigenvalues of two shifted Hankel matrices. The natural-log transformed generalised eigenvalues are divided with the time step size. 

The amplitudes can be found by solving the Vandermonde system.


3. Incompatibility of Prony-lile method with economic and financial data

3.1. Economic and financial data not following Shannon-Nyquist sampling theorem 

Prony-like method is based on the exponential of complex number which requires the appropriate sampling rate to avoid the misjudgement of parameters. According to Shannon-Nyquist sampling theorem, the sampling rate must be twice of the maximum signal length. As shown in the graph, the misadjusted sampling rate fails to retrieve some complex exponentials.

In contrast, there is no signal period defined in economic and financial data. For example, the fluctuation of financial time series is so substantially high that it is nearly impossible to measure the precise signalling length of its periodic behaviour. Moreover, the time steps are not equidistributed e.g. the market opening hours, and then the uniform step size is very less likely to be defined.

In this case, it needs to ignore the issues with the sampling rate while analysing economic and financial time series.


3.2. Rational approximation not applied 

For the signal processing analyses, Prony-like method models both the physical signal components and the noise components instead of filtering the noise out. The rational behind it is to discover the precise number of the terms representing the physical signal components. The involvement of the faint terms and the cluster terms makes it more difficult to find the numbers of the terms and retrieve the parameters. 

A Padé approximation explains that the involvement of the sufficient number of the terms induces to more accurate detecting the number of the physical component terms as well as retrieving more accurate parameters.  As shown in the graph, by expanding the number of the terms, the gap between the denominator and the numerator of the rational function becomes smaller. Then, the accurate number of the physical component terms is likely to be revealed.


Nevertheless, my experiment of Prony-like method with financial data has resulted in the failure of applying this rational approximation theory. By substantially increasing the number of the terms, the reconstruction of time series with the retrieved parameters becomes less accurate. The cause of this failure is assumed to be the erratic nature of financial data which is unable to distinguish between the physical components and the noise components. Therefore, the noise components can be falsely counted as the physical terms so that it may need to rather filter the noise for analysing time series for economics and finance.

Involving the excess number of the noise term may conversely miscalculate the number of the necessary number of the terms used for retrieving the parameters. Then, it seems to claim for an alternative approach for the analysis of financial data for estimating the necessary number of the terms used for retrieving the parameters.


4. Specific approach for financial time series

4.1. Overcoming the problem of retrieving parameters 

I would like to propose a possible alternative method to overcome the problem of detecting the necessary number of the terms for the reconstruction: It can refer to the singular values which are the absolute values of the eigenvalues of a normal matrix. The number of non-zero singular values is equal to the rank of the matrix. The singular value decomposition (SVD) of the previously mentioned Hankel matrix generates the singular values inferring to approximate the physical component terms.


As shown in the graph, the singular values of financial time series do not indicate the clear cut between those with the high values and the low values. This indicates that there is no clear distinction between the physical component terms and the noise terms in financial time series. 

The alternative method is to focus on the derivative of these singular values. In terms of this data, the decrease in the singular values becomes significantly smaller when the number of the terms becomes higher than about 120. My method is to find the point where the difference between the singular value becomes 1. With reference to the number of the term for retrieving the parameters which this method suggests, a set of the parameters obtained by the reconstruction process with this number has provided relatively more fitting models than the other counterparts.

The perfect accuracy is sacrificed by means of this proposed method. However, the movement of financial time series is too erratic to derive a highly accurate estimation than time series of physical science and engineering. The objective for financial data is to offer the estimation as a benchmark for decision making even with a high margin of error. 


4.2.  Decomposition with useful parameters

This graph is the example of modelling the business cycle (e.g., Kondratiev and Elliott cycle). The raw cycle represents the financial market performance whereas the smoothed cycle represents the economic structural capacity.

The parameters for modelling the business cycle are selected from the reconstructed parameters with the number of terms referring to the size of the singular values. Firstly, the top 10 highest amplitudes are selected from the reconstruction process. Secondly, it omits those representing the upward trend with their frequency-components including only real number (no imaginary number). Thirdly, they are reordered in ascending order by means of the absolute value of the imaginary part of the frequency. This process is to distinguish between the business cycle part and the smoothed cycle part.

This process derived from Prony-like method is way more complicated than the one derived from SSA. At the same time, Prony-like method offers the retrieved parameters of the cycle volatility which SSA does not offer. For example, the analysis based on the separated time line such as the yearly analysis provides the risk management measure referring to the changing volatility magnitude. Prony-like method tends to more clearly indicate the precautionary risk measure with the retrieved parameters. 

Overall, although the adaptation Prony-like method to the analysis of financial data and econometric still needs to be carefully scrutinised, it has a beneficial potential for these useful applications.


5. SWOT Analysis




Wednesday, December 20, 2023

Satori and Koishi Komeiji @ Touhou Project

 

Tool used: Clip Studio Paint 

My illustration is really influenced by the Japanese girl's comic style, I have realised myself while drawing the Komeiji sisters .

These illustrations are posted in my Pixiv page so please visit https://www.pixiv.net/en/users/32334195!    






Tuesday, December 12, 2023

SSA attempt functioning: The main SSA codes copied from StackExchange

  

The synthetic data imitating the business cycle is my original although the functions of generating the complex time series refers to my supervisor's work.

The set of Python codes implementing Singular Spectrum Analysis is directly copied from "Singular spectrum analysis and their "eigentriplets" (Stack Exchange). This SSA method is almost identical to the one based on MATLAB "Singular Spectrum Analysis - Beginners guide" (MathWorks) which I attempted to duplicate with Python. 

This set of Python codes succeeds in the reconstruction component (RC) as shown in the graph.


# importing necessary tools
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import seaborn as sns
import networkx as nx
import numpy as np
import pandas as pd
import random
import statistics
import scipy.linalg
from numpy import linalg
from scipy import stats
import statsmodels.formula.api as sm
import math
import cmath
from scipy.linalg import hankel

# Showing the plots bigger
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (20,10)


## Creating a synthetic time series imitating the business cycle

# Defining the function for the synthetic complex exponential  
def syn_exp(amp,freq,x):
    t=len(amp)
    y=[]
    for omg in range(len(x)):
        yi=0
        for ap in range(len(amp)):
            yi+=amp[ap]*cmath.exp(freq[ap]*x[omg])
        y.append(yi)
    return y

# Defining complex number
i=complex(0,1)
i


# Creating the synthetic data imitating the business cycle
phi = [0.03, 3*i, 10*i, -1+30*i, -1-20*i, 0.01+5*i];
alpha = [4, 3, 2, 2, 2, 1];

# sampling rate
S = 100

# number of exponentials in the signal
n = 600

# f_syn: samples used by Prony's methhod
_2n_1=(2*n-1)+1
omega = [2*math.pi/S*(_0__2n_1) for _0__2n_1 in range(_2n_1)]

# Defininig the synthetic time series
f_syn = syn_exp(alpha, phi, omega)

# Defining the noise
noise=[np.random.normal(0) for k in range(len(f_syn))]
noise=noise/linalg.norm(noise)

# eps: noise level, also try 10^(-0)
eps =  10^(-1);
#eps = 0

# f = f_syn added with random noise
f = f_syn + eps*noise;
f

# Re-defining the time series and its length
s=f;
N=len(f)


## The SSA starts
print("The following refers to https://stats.stackexchange.com/questions/544105/singular-spectrum-analysis-and-their-eigentriplets")
print("The SSA method of this set of Python codes is identical to the embedded covariance method of https://uk.mathworks.com/matlabcentral/fileexchange/58967-singular-spectrum-analysis-beginners-guide")

# Getting window length L and lagged length K
L = 500-1
K = N - L + 1

# Constructing the time lagged Hankel matrix
X=np.zeros((K,L))
for m in range (0, L):
    X[:,m] = s[m:K+m]
   
# Trajectory matrix
Cemb = np.dot(X.T,X)/K

# Eigen decomposition
eigenValues, eigenVectors = np.linalg.eig(Cemb)
idx = eigenValues.argsort()[::-1]  
eigenValues = eigenValues[idx]
eigenVectors = eigenVectors[:,idx]

# Vectors of Principal Components
PC = np.dot(X,eigenVectors)

# Pre-allocating Reconstructed Component Matrix
RC = np.zeros((N, L))
# Reconstruct the elementary matrices without storing them
for k in range(L):
    myBuf = np.outer(PC[:,k], eigenVectors[:,k].T)
    myBuf = myBuf[::-1]
    RC[:,k] = [myBuf.diagonal(j).mean()\
               for j in range(-myBuf.shape[0]+1, myBuf.shape[1])]
       
# # First 6 RC
# fig, ax = plt.subplots(3,2)
# ax = ax.flatten()
# for k in range (0, 6):
#     ax[k].plot(RC[:,k])
#     ax[k].set_title(str(i))
# plt.tight_layout()


# Plotting a graph comparison
RawSSA=RC[:,0]
for k in range(L-1):
    RawSSA=np.add(RawSSA,RC[:,k+1])
SCs=10 # Smoothed cycle components
SmoothedSSA=RC[:,0]
for k in range(SCs-1):
    SmoothedSSA=np.add(SmoothedSSA,RC[:,k+1])
plt.subplot(2,1,1)
plt.title("Original vs. Reconstructed time series")
plt.plot(s[:])
plt.plot(RawSSA)
plt.subplot(2,1,2)
plt.title("Reconstructed raw vs. smoothed time series")
plt.plot(RawSSA)
plt.plot(SmoothedSSA)


 

Monday, December 11, 2023

Marisa Kirisame and Master Spark

 

Special tribute to the No.1 girl in 2023! I have painted Marisa Kirisame with Master Spark! My favourite special move in Touhou Project! 

Tool used: Clip Studio Paint 

These illustrations are posted in my Pixiv page so please visit https://www.pixiv.net/en/users/32334195!    


 





 

Sunday, December 03, 2023

SSA attempt: Duplicating a MATLAB example with Python

 


 The following codes refer to the MATLAB codes displayed in https://uk.mathworks.com/matlabcentral/fileexchange/58967-singular-spectrum-analysis-beginners-guide

 There are several algorithm mismatches between Python functions and MATLAB functions which cause the malfunctioning of generating the components as the same as what have been generated in MATLAB.

It has to admit that this is incomplete. However, this attempt provides the pathway to experiment of implementing Singular Spectrum Analysis (SSA) which is a relatively newly introduced time series analysis with Python.

SSA itself has so many variations that there does not seem to be any one solid method. There are several varying methods of SSA for each different programming language and each platform.

The following simulation is based on my synthetic data imitating the business cycle: 

# importing necessary tools
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import networkx as nx
import numpy as np
import pandas as pd
import random
import statistics
import scipy.linalg
from numpy import linalg
from scipy import stats
import statsmodels.formula.api as sm
import math
import cmath
from scipy.linalg import hankel

# Showing the plots bigger
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (20,10)

# Defining the function for the synthetic complex exponential  
def syn_exp(amp,freq,x):
    t=len(amp)
    y=[]
    for omg in range(len(x)):
        yi=0
        for ap in range(len(amp)):
            yi+=amp[ap]*cmath.exp(freq[ap]*x[omg])
        y.append(yi)
    return y

# Defining complex number
i=complex(0,1)
i

# Creating the synthetic data imitating the business cycle
phi = [0.03, 3*i, 10*i, -1+30*i, -1-20*i, 0.01+5*i];
alpha = [4, 3, 2, 2, 2, 1];

# sampling rate
S = 100

# number of exponentials in the signal
n = 600

# f_syn: samples used by Prony's methhod
_2n_1=(2*n-1)+1
omega = [2*math.pi/S*(_0__2n_1) for _0__2n_1 in range(_2n_1)]

# Defininig the synthetic time series
f_syn = syn_exp(alpha, phi, omega)

# Defining the noise
noise=[np.random.normal(0) for k in range(len(f_syn))]
noise=noise/linalg.norm(noise)

# eps: noise level, also try 10^(-0)
eps =  10^(-1);
#eps = 0

# f = f_syn added with random noise
f = f_syn + eps*noise;
f

# Extract only the real values of f
X=f.real
# Plotting x (real values of the synthetic time series)
plt.plot(X)
plt.title('Read value of the synthetic data')
plt.xlabel('Time')
plt.ylabel('Magnitude')
plt.show
# Redefining the time series length
N = n*2;

# The following codes refer to the MATLAB codes displayed in https://uk.mathworks.com/matlabcentral/fileexchange/58967-singular-spectrum-analysis-beginners-guide
print('The following codes refer to the MATLAB codes displayed in https://uk.mathworks.com/matlabcentral/fileexchange/58967-singular-spectrum-analysis-beginners-guide')

## Set general Parameters
M = 30;    # window length = embedding dimension

## Calculate covariance matrix (trajectory approach)
# An alternative approach is to determine C directly from the scalar product of Y,
# the time-delayed embedding of X. Although this estimation of C does not give a Toeplitz structure,
# with the eigenvectors not being symmetric or antisymmetric, it ensures a positive semi-definite
# covariance matrix.

Y=[]
for k in range(M):
    Y.append(X[k:N-M+k].tolist())
Y=np.array(Y)

# Creating the embedded covariance matrix
Cemb = np.dot(Y,(Y.transpose()))/(N-M+1)
img = plt.imshow(Cemb, origin='lower', cmap='jet', interpolation='nearest', aspect='auto')
plt.show()

## Choose covariance estimation
# Choose between Toeplitz approach (cf. Vautard & Ghil) and trajectory approach (cf. Broomhead & King).
C=Cemb

# Finding the eigenvalues (Lambda) and the eigenvectors (RHO)
[LAMBDA,RHO]=scipy.linalg.eig(C)
# Extracting the real parts of them
LAMBDA=LAMBDA.real
RHO=RHO.real

## Calculate principal components PC
# The principal components are given as the scalar product between Y,
# the time-delayed embedding of X, and the eigenvectors RHO

PC=[]
for j in range(M):
    pc=[]
    for k in range(len(Y[0])):
        pc.append(np.dot(Y[:,k],RHO[:,j]))
    PC.append(pc)
PC=np.array(PC)

# Plotting a graph comparison
RawSSA=PC[0]
for k in range(M-1):
    RawSSA=np.add(RawSSA,PC[k+1])
SmoothedSSA=PC[0]
for k in range(3-1):
    SmoothedSSA=np.add(SmoothedSSA,PC[k+1])
plt.subplot(2,1,1)
plt.title("Original vs. Reconstructed time series")
plt.plot(X[20:])
plt.plot(RawSSA/5)
plt.subplot(2,1,2)
plt.title("Reconstructed raw vs. smoothed time series")
plt.plot(RawSSA/5)
plt.plot(SmoothedSSA/5)



 

 

 

 

Tuesday, November 28, 2023

Kochiya Sanae, Touhou Project

 





 

Tool used: Clip Studio Paint 

These illustrations are posted in my Pixiv page so please visit https://www.pixiv.net/en/users/32334195!   

I have actually put an effort for this posing of Sanae-chan: have taken this croquis technique for this posing!