[Draft]Adjustment to the PCA Approach · Pull Request #41 · hudson-and-thames/arbitragelab

ghost · 2021-03-02T16:38:56Z

Purpose

Describe the problem or feature in addition to a link to the issues.

Approach

How does this change address the problem?

Tests for New Behavior

What new tests were added to cover new features or behaviors?

Checklist

Make sure you did the following (if applicable):

Added tests for any new features or behaviors.
Ran ./pylint to make sure code style is consistent.
Built and reviewed the docs.
Added a note to the changelog.

Learning

Describe the research stage

Links to blog posts, patterns, libraries or addons used to solve this problem

ghost · 2021-03-07T01:49:42Z

Add an option to use a variable value of explained variance (like 45%, 55%, 65%) by PCA factors.

PanPip

Good progress 👌

I left some comments regarding the code that we can discuss.

Now we should polish the docstrings and start writing the Sphinx docs.

PanPip · 2021-03-18T14:29:07Z


-
 # pylint: disable=invalid-name
+# pylint: disable=R0913


Let's rather do

# pylint: disable=invalid-name, too-many-arguments

PanPip · 2021-03-18T14:29:52Z

+        :param matrix: (pd.DataFrame) DataFrame with returns that need to be standardized.
+        :param vol_matrix: (pd.DataFrame) DataFrame with historical trading volume data.
+        :param k: (int) Look-back window used for volume moving average.
+        :return: (pd.DataFrame) a volume-adjusted returns dataFrame


:return: (pd.DataFrame) A volume-adjusted returns dataFrame.
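
As a rough illustration of what a function with this signature might compute, the sketch below scales each return by the ratio of the k-period moving-average volume to the current volume. That scaling rule is an assumption, as the PR thread does not show the function body; the name `volume_adjusted_returns` and the dropping of warm-up rows are hypothetical.

```python
import pandas as pd


def volume_adjusted_returns(matrix: pd.DataFrame, vol_matrix: pd.DataFrame, k: int) -> pd.DataFrame:
    """
    Hypothetical sketch: scale returns by (k-period average volume / current volume).

    :param matrix: (pd.DataFrame) DataFrame with returns.
    :param vol_matrix: (pd.DataFrame) DataFrame with historical trading volume data.
    :param k: (int) Look-back window used for volume moving average.
    :return: (pd.DataFrame) A volume-adjusted returns DataFrame.
    """
    # Moving average of volume; the first k - 1 rows are NaN (warm-up period)
    vol_ma = vol_matrix.rolling(window=k).mean()

    # Days with above-average volume get their returns scaled down, and vice versa
    adjusted = matrix * (vol_ma / vol_matrix)

    # Drop the warm-up rows where no moving average is available yet
    return adjusted.dropna(how="all")


# Example data (made up for illustration): constant volume leaves returns unchanged
example_returns = pd.DataFrame({"A": [0.01, 0.02, -0.01, 0.03], "B": [0.0, 0.01, 0.02, -0.02]})
example_volume = pd.DataFrame({"A": [100.0] * 4, "B": [50.0] * 4})
example_adjusted = volume_adjusted_returns(example_returns, example_volume, k=2)
```

With constant volume the ratio is 1, so the output equals the input returns minus the first `k - 1` warm-up rows.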

PanPip · 2021-03-18T14:32:19Z

+        # Fill missing data with preceding values
+        returns = matrix.dropna(axis=0)


Should we rather fill values?
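
To make the mismatch between the comment and the code concrete, here is a minimal sketch on made-up data comparing the two behaviours. `dropna(axis=0)` removes every row that contains a missing value, while `ffill()` carries the preceding value forward, which is what the comment describes.

```python
import numpy as np
import pandas as pd

# Made-up data with one missing observation in column "A"
matrix = pd.DataFrame({"A": [0.01, np.nan, 0.03], "B": [0.02, 0.01, -0.01]})

# What the code in the diff does: drop every row that has a NaN in any column
dropped = matrix.dropna(axis=0)

# What the comment describes: fill missing data with preceding values
filled = matrix.ffill()
```

Dropping shortens the sample for all assets whenever a single asset has a gap, whereas forward filling keeps the full index, so the choice affects every window-based calculation downstream.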

PanPip · 2021-03-18T14:37:00Z

+        # Standardized: fill nan with zero / std: fill nan with 1
+


This can probably be removed now.

PanPip · 2021-03-18T14:38:25Z

        So the output is a dataframe containing the weight for each asset in a portfolio for each eigen vector.

        :param matrix: (pd.DataFrame) Dataframe with index and columns containing asset returns.
+        :param explained_var (float) The user-defined explained variance criteria.


We should add that if this parameter is given, it will override the n_components parameter. And also mention that it should be in the range from 0 to 1.
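
A possible wording for the docstring, together with a small check of the range, could look like the sketch below. The helper `resolve_component_rule` is hypothetical and not part of the PR, and treating both 0 and 1 as valid endpoints is an assumption.

```python
from typing import Optional, Tuple, Union


def resolve_component_rule(n_components: int = 15,
                           explained_var: Optional[float] = None) -> Tuple[str, Union[int, float]]:
    """
    Decide which rule selects the number of PCA components (hypothetical helper).

    :param n_components: (int) Number of PCA components to use.
    :param explained_var: (float) The user-defined explained variance criterion, in the
        range from 0 to 1. If given, it overrides the n_components parameter.
    :return: (tuple) Name of the rule in effect and its value.
    """
    # Without explained_var, the fixed number of components is used
    if explained_var is None:
        return "n_components", n_components

    # Assumption: both endpoints of the range are accepted
    if not 0 <= explained_var <= 1:
        raise ValueError("explained_var must be in the range from 0 to 1.")

    # explained_var takes precedence over n_components
    return "explained_var", explained_var
```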

PanPip · 2021-03-18T14:51:02Z

+Tests the PCA Strategy from the Other Approaches module.
+"""
+
+import unittest
+import os
+import pandas as pd
+import numpy as np
+from arbitragelab.other_approaches import ETFStrategy
+
+
+class TestPCAStrategy(unittest.TestCase):
+    """
+    Tests PCAStrategy class.


The naming should be fixed.

PanPip · 2021-03-18T14:52:47Z

+        # Check target weights
+        self.assertAlmostEqual(target_weights.mean()['EEM'], 0.333333, delta=1e-5)
+        self.assertAlmostEqual(target_weights.mean()['BND'], -0.5, delta=1e-5)
+        self.assertAlmostEqual(target_weights.mean()['SPY'], -0.38888, delta=1e-5)
+
+        # Check drift argument
+        target_weights = self.etf_strategy.get_signals(smaller_etf, smaller_dataset, k=1, corr_window=252,
+                                                       residual_window=60, sbo=1.25, sso=1.25, ssc=0.5,
+                                                       sbc=0.75, size=1, drift=True)
+
+        # Check target weights
+        self.assertAlmostEqual(target_weights.mean()['EEM'], 0.333333, delta=1e-5)
+        self.assertAlmostEqual(target_weights.mean()['BND'], -0.5, delta=1e-5)
+        self.assertAlmostEqual(target_weights.mean()['SPY'], -0.38888, delta=1e-5)


It's interesting that these test values are the same.

PanPip · 2021-03-18T14:53:40Z

+        # Check target weights
+        self.assertAlmostEqual(target_weights.mean()['EEM'], 0.333333, delta=1e-5)
+        self.assertAlmostEqual(target_weights.mean()['BND'], -0.5, delta=1e-5)
+        self.assertAlmostEqual(target_weights.mean()['SPY'], -0.38888, delta=1e-5)


And these too. Can we pick the values of the parameters so the outputs are different?

PanPip · 2021-03-18T14:54:37Z

+
+    def __init__(self, n_components: int = 15):
+        """
+        Initialize PCA StatArb Strategy.


Docstrings in this class should be fixed.

PanPip · 2021-03-18T14:56:31Z

+        First, the correlation matrix to get PCA components is calculated using a
+        corr_window parameter. From this, we get weights to calculate PCA factor returns.
+        These weights are being recalculated each time we generate (residual_window) number
+        of signals.


All these descriptions should be updated to match the ETF Approach.

PanPip

Made some code fixes to this PR.

PanPip · 2021-03-19T14:50:35Z

+            condition = min(np.cumsum(expl_variance), key=lambda x: abs(x - explained_var))
+            # The number of components to use
+            num_pc = np.where(np.cumsum(expl_variance) == condition)[0][0] + 1


This part is not working as expected; I'll show an example.
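
The reviewer's own example is not included in this thread. One way the selection above can surprise, offered here only as an assumption about the issue, is that it picks the cumulative value closest to `explained_var`, which can lie below the requested level. The sketch below reproduces the quoted logic in `closest_count` and contrasts it with a hypothetical threshold-based alternative, `threshold_count`, built on `np.searchsorted`.

```python
import numpy as np


def closest_count(expl_variance, explained_var):
    """Logic from the diff: number of components whose cumulative variance is closest to the target."""
    cum_var = np.cumsum(expl_variance)
    condition = min(cum_var, key=lambda x: abs(x - explained_var))
    return int(np.where(cum_var == condition)[0][0] + 1)


def threshold_count(expl_variance, explained_var):
    """Hypothetical alternative: smallest number of components that reaches the target."""
    cum_var = np.cumsum(expl_variance)
    # Index of the first cumulative value that is >= explained_var
    num_pc = int(np.searchsorted(cum_var, explained_var, side="left")) + 1
    # Guard against rounding leaving the last cumulative value slightly below the target
    return min(num_pc, len(cum_var))


# Made-up explained variance ratios; cumulative values are roughly 0.4, 0.7, 0.9, 1.0
expl_variance = np.array([0.4, 0.3, 0.2, 0.1])
n_closest = closest_count(expl_variance, 0.5)      # cumulative 0.4 is closer to 0.5 than 0.7
n_threshold = threshold_count(expl_variance, 0.5)  # first cumulative value at or above 0.5
```

On this data the closest-value rule keeps a single component explaining about 40% of the variance, below the requested 50%, while the threshold rule keeps two.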

PanPip · 2021-03-19T14:51:12Z

+        A function to calculate weights (scaled eigen vectors) to use for factor return calculation with
+        asymptotic PCA.
+
+        Weights are calculated from PCA components as:
+
+        Weight = Eigen vector / std.(R)
+
+        So the output is a dataframe containing the weight for each asset in a portfolio for each eigen vector.


Please adjust this docstring to reflect the idea behind the asymptotic PCA.
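
For reference, the formula quoted in this docstring (weight = eigenvector / std of returns) can be sketched as follows. This illustrates only the standard scaling; the asymptotic PCA variant is not described in this thread and is not attempted here. The function name `pca_weights` and the use of `numpy.linalg.eigh` on the correlation matrix are assumptions.

```python
import numpy as np
import pandas as pd


def pca_weights(matrix: pd.DataFrame, n_components: int) -> pd.DataFrame:
    """
    Hypothetical sketch: scaled eigenvectors of the correlation matrix of returns.

    :param matrix: (pd.DataFrame) DataFrame with index and columns containing asset returns.
    :param n_components: (int) Number of eigenvectors to keep.
    :return: (pd.DataFrame) Weights, one row per eigenvector and one column per asset.
    """
    std = matrix.std()
    corr = matrix.corr()

    # eigh returns eigenvalues in ascending order, so sort descending and keep the top ones
    eigenvalues, eigenvectors = np.linalg.eigh(corr.values)
    order = np.argsort(eigenvalues)[::-1][:n_components]
    top_vectors = eigenvectors[:, order]

    # Weight = eigenvector / std(R), applied asset by asset
    weights = pd.DataFrame(top_vectors.T, columns=matrix.columns)
    return weights.div(std, axis="columns")


# Made-up returns for four assets
rng = np.random.default_rng(0)
example_matrix = pd.DataFrame(rng.normal(scale=0.01, size=(100, 4)), columns=list("ABCD"))
example_weights = pca_weights(example_matrix, n_components=2)
```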

Create pca approach section in the doc

2674366

ghost added the documentation and enhancement labels Mar 2, 2021

ghost requested a review from PanPip March 2, 2021 16:38

ghost self-assigned this Mar 2, 2021

[Draft]Add an option to use a variable value of explained variance(like 45%) by PCA factors.

101add9

jamiekeng1016 and others added 21 commits March 6, 2021 18:00

Add the variable, explained_var in test file

73b86d0

bug fixed

fbe104e

Update pca_approach.py

6da4e78

Merge branch 'develop' into basic_PCA

74fc5ba

Merge branch 'basic_PCA' of https://github.com/hudson-and-thames/arbitragelab into basic_PCA

21bcdc4

Add stationary test, modified sscore

d8edcdd

[Draft]Add stationary test, drift.

07489bc

[Draft] add volume_modified_return function

962cf3b

[Draft] add asymptotic PCA option

dde8280

[Draft] fix explained_variance bug

b849a33

[Draft]add etf_approach.py

28e0830

[Draft] update unittest , fix bugs

5208c08

[Draft] etf_approach style check

5049ecf

[Draft] add unittest for volume_data/ add stock_volume.csv

ef5515e

[Draft]add docstring to test_volume_modified_return

82124f9

[Draft] delete comment made by mistake

bc9481d

[Draft] fix trailing white space

ae45880

[Draft] volume_modified_return bug fixed

abe6e38

[Draft] residual stationarity p_value option

9100b40

[Draft]etf_approach unittest, speed of mr adjustment,

4c98d70

[Draft]unittest adjustment

223d780

PanPip reviewed Mar 18, 2021

[Draft]docstring of etf approach

2986473

PanPip added 3 commits March 19, 2021 13:51

Adjusted the PCA Approach file structure

608432b

Adjusted PCA Strategy code logic

f5478e7

Adjusted PCA Approach unit tests

c51d8f3

PanPip reviewed Mar 19, 2021

PanPip assigned PanPip and unassigned ghost Mar 19, 2021

PanPip removed their assignment Sep 12, 2022
