This Blog is Systematic: Correlations, Weights, Multipliers.... (pysystemtrade)

Friday 29 January 2016

Correlations, Weights, Multipliers.... (pysystemtrade)

This post serves three main purposes:

Firstly, I'm going to explain the main features I've just added to my python back-testing package pysystemtrade; namely the ability to estimate parameters that were fixed before: forecast and instrument weights; plus forecast and instrument diversification multipliers.

(See here for a full list of what's in version 0.2.1)

Secondly I'll be illustrating how we'd go about calibrating a trading system (such as the one in chapter 15 of my book); actually estimating some forecast weights and instrument weights in practice. I know that some readers have struggled with understanding this (which is of course entirely my fault).

Thirdly there are some useful bits of general advice that will interest everyone who cares about practical portfolio optimisation (including both non users of pysystemtrade, and non readers of the book alike). In particular I'll talk about how to deal with missing markets, the best way to estimate portfolio statistics, pooling information across markets, and generally continue my discussion about using different methods for optimising (see here, and also here).

If you want to, you can follow along with the code, here.

Key

This is python:

system.forecastScaleCap.get_scaled_forecast("EDOLLAR", "carry").plot()

This is python output:

hello world

This is an extract from a pysystemtrade YAML configuration file:

forecast_weight_estimate:
   date_method: expanding ## other options: in_sample, rolling
   rollyears: 20
   frequency: "W" ## other options: D, M, Y

Forecast weights

A quick recap

The story so far; we have some trading rules (three variations of the EWMAC trend following rule, and a carry rule); which we're running over six instruments (Eurodollar, US 10 year bond futures, Eurostoxx, MXP USD fx, Corn, and European equity vol; V2X).

We've scaled these forecasts (as discussed in my previous post), so both of the following are on the same scale:

system.forecastScaleCap.get_scaled_forecast("EDOLLAR", "carry").plot()

Rolldown on STIR usually positive. Notice the interest rate cycle.

system.forecastScaleCap.get_scaled_forecast("V2X", "ewmac64_256").plot()

Notice how we moved from 'risk on' to 'risk off' in early 2015

Notice the massive difference in available data - I'll come back to this problem later.

However having multiple forecasts isn't much good; we need to combine them (chapter 8). So we need some forecast weights. This is a portfolio optimisation problem. To be precise we want the best portfolio built out of things like these:

Account curves for trading rule variations, US 10 year bond future. All pretty good....

There are some issues here then which we need to address.

An alternative which has been suggested to me is to optimise the moving average rules separately, and then as a second stage optimise the weights between the moving average group and the carry rule. This is similar in spirit to the handcrafted method I cover in my book. Whilst it's a valid approach it's not one I cover here, nor is it implemented in my code.

In or out of sample?

Personally I'm a big fan of expanding windows (see chapter 3, and also here); nevertheless feel free to try different options by changing the configuration file elements shown here.

forecast_weight_estimate:
   date_method: expanding ## other options: in_sample, rolling
   rollyears: 20
   frequency: "W" ## other options: D, M, Y

Also the default is to use weekly returns for optimisation. This has two advantages: firstly it's faster; secondly, correlations of daily returns tend to be unrealistically low (because, for example, of different market closes when working across instruments).
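To make the three date_method options concrete, here is a minimal sketch of how fitting periods could be generated for annual refits. The helper name fit_periods and its arguments are hypothetical (this is not the pysystemtrade API), and it assumes each set of weights is applied to the year following the fitting period, with in_sample fitting on the whole history.

```python
def fit_periods(first_year, last_year, date_method="expanding", rollyears=20):
    """Return (fit_start, fit_end, apply_year) tuples for annual refits.

    Years are inclusive. Hypothetical helper, for illustration only.
    """
    periods = []
    for apply_year in range(first_year + 1, last_year + 1):
        if date_method == "in_sample":
            # cheating: fit on all the data, including the year we apply to
            fit_start, fit_end = first_year, last_year
        elif date_method == "expanding":
            # fit on everything available before the year we apply to
            fit_start, fit_end = first_year, apply_year - 1
        elif date_method == "rolling":
            # fit on at most the last `rollyears` years
            fit_start = max(first_year, apply_year - rollyears)
            fit_end = apply_year - 1
        else:
            raise ValueError("unknown date_method: %s" % date_method)
        periods.append((fit_start, fit_end, apply_year))
    return periods


expanding = fit_periods(2000, 2004, "expanding")
rolling = fit_periods(2000, 2004, "rolling", rollyears=2)
```

With an expanding window the fitting period always starts in the first year; with a rolling window it drops the oldest data once more than rollyears of history is available.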

Choose your weapon: Shrinkage, bootstrapping or one-shot?

In my last couple of posts on this subject I discussed which methods one should use for optimisation (see here, and also here, and also chapter four).

I won't reiterate the discussion here in detail, but I'll explain how to configure each option.

Bootstrapping

This is my favourite weapon, but it's a little ..... slow.

forecast_weight_estimate:
   method: bootstrap
   monte_runs: 100
   bootstrap_length: 50
   equalise_means: True
   equalise_vols: True

We expect our trading rule p&l to have the same standard deviation of returns, so we shouldn't need to equalise vols; it's a moot point whether we do or not. Equalising means will generally make things more robust. With more bootstrap runs, and perhaps a longer length, you'll get more stable weights.
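As a rough illustration of what the bootstrap does, the sketch below draws monte_runs random samples (with replacement) of bootstrap_length returns, optimises on each sample, and averages the resulting weights. It is not the pysystemtrade implementation: the optimiser is a crude grid search over long-only weights swapped in to keep the example short, equalise_means here simply sets every asset's mean to the same hypothetical positive target, and all function names and data are made up.

```python
import random
import statistics


def sharpe(weights, returns):
    """Unannualised Sharpe ratio of a weighted portfolio of return rows."""
    port = [sum(w * r for w, r in zip(weights, row)) for row in returns]
    sd = statistics.pstdev(port)
    return statistics.mean(port) / sd if sd > 0 else 0.0


def grid_weights(n_assets, steps=5):
    """All long-only weight vectors on a coarse grid that sum to one."""
    def build(remaining, slots):
        if slots == 1:
            return [[remaining]]
        out = []
        for k in range(remaining + 1):
            for rest in build(remaining - k, slots - 1):
                out.append([k] + rest)
        return out
    return [[k / steps for k in combo] for combo in build(steps, n_assets)]


def equalise_means(returns, target_mean=0.05):
    """Shift each column so every asset has the same (assumed) mean."""
    means = [statistics.mean(col) for col in zip(*returns)]
    return [[r - m + target_mean for r, m in zip(row, means)]
            for row in returns]


def bootstrap_weights(returns, monte_runs=100, bootstrap_length=50, seed=0):
    """Average the single-period optimal weights over many resamples."""
    rng = random.Random(seed)
    n_assets = len(returns[0])
    candidates = grid_weights(n_assets)
    total = [0.0] * n_assets
    for _ in range(monte_runs):
        sample = rng.choices(returns, k=bootstrap_length)  # with replacement
        sample = equalise_means(sample)
        best = max(candidates, key=lambda w: sharpe(w, sample))
        total = [t + b for t, b in zip(total, best)]
    return [t / monte_runs for t in total]


# Made-up data: two highly correlated assets and one independent asset
rng = random.Random(42)
data = []
for _ in range(300):
    common = rng.gauss(0.0, 1.0)
    data.append([common + rng.gauss(0.0, 0.3),
                 common + rng.gauss(0.0, 0.3),
                 rng.gauss(0.0, 1.0)])

weights = bootstrap_weights(data, monte_runs=20, bootstrap_length=50)
```

Because each run sees a different resample, the averaged weights are less extreme than any single optimisation, which is where the robustness comes from.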

Shrinkage

I'm not massively keen on shrinkage (see here, and also here) but it is much quicker than bootstrapping. So a good work flow might be to play around with a model using shrinkage estimation, and then for your final run use bootstrapping. It's for this reason that the pre-baked system defaults to using shrinkage. As the defaults below show I recommend shrinking the mean much more than the correlation.

forecast_weight_estimate:
   method: shrinkage
   shrinkage_SR: 0.90
   shrinkage_corr: 0.50
   equalise_vols: True
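A small sketch of what the two shrinkage factors do, assuming that the factor is the weight placed on the prior and that the prior is the cross-sectional average (an assumption for illustration; the helper name shrink and the input numbers are hypothetical).

```python
def shrink(estimates, shrinkage):
    """Pull each estimate towards the cross-sectional average.

    shrinkage=0 keeps the raw estimates; shrinkage=1 replaces them all
    with the average (the prior).
    """
    prior = sum(estimates) / len(estimates)
    return [shrinkage * prior + (1.0 - shrinkage) * e for e in estimates]


sharpe_ratios = [0.2, 0.6, 1.0]   # hypothetical rule Sharpe ratios
correlations = [0.9, 0.7, 0.2]    # hypothetical pairwise correlations

shrunk_sr = shrink(sharpe_ratios, 0.90)     # roughly [0.56, 0.60, 0.64]
shrunk_corr = shrink(correlations, 0.50)    # roughly [0.75, 0.65, 0.40]
```

With a factor of 0.90 the Sharpe ratios end up almost identical, whereas the correlations keep half of their estimated differences.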

Single period

Don't do it. If you must do it then I suggest equalising the means, so the result isn't completely crazy.

forecast_weight_estimate:
   method: one_period
   equalise_means: True
   equalise_vols: True

To pool or not to pool... that is a very good question

One question we should address is, do we need different forecast weights for different instruments, or can we pool our data and estimate them together? Or to put it another way does Corn behave sufficiently like Eurodollar to justify giving them the same blend of trading rules, and hence the same forecast weights?

forecast_weight_estimate:
   pool_instruments: True

One very significant factor in making this decision is actually costs. However I haven't yet included the code to calculate the effect of these. For the time being then we'll ignore this; though it does have a significant effect. Because of the choice of three slower EWMAC rule variations this omission isn't as serious as it would be with faster trading rules.

If you use a stupid method like one-shot then you probably will get quite different weights. However more sensible methods will account better for the noise in each instrument's estimate.

With only six instruments, and without costs, there isn't really enough information to determine whether pooling is a good thing or not. My strong prior is to assume that it is. Just for fun here are some estimates without pooling.

system.config.forecast_weight_estimate["pool_instruments"]=False
system.config.forecast_weight_estimate["method"]="bootstrap"
system.config.forecast_weight_estimate["equalise_means"]=False
system.config.forecast_weight_estimate["monte_runs"]=200
system.config.forecast_weight_estimate["bootstrap_length"]=104

system=futures_system(config=system.config)

system.combForecast.get_forecast_weights("CORN").plot()
title("CORN")
show()

Forecast weights for corn, no pooling

system.combForecast.get_forecast_weights("EDOLLAR").plot()
title("EDOLLAR")
show()

Forecast weights for eurodollar, no pooling

Note: Only instruments that share the same set of trading rule variations will see their results pooled.

Estimating statistics

There are also configuration options for the statistical estimates used in the optimisation; so for example should we use exponential weighted estimates? (this makes no sense for bootstrapping, but for other methods is a reasonable thing to do). Is there a minimum number of data points before we're happy with our estimate? Should we floor correlations at zero (short answer - yes).

forecast_weight_estimate:

   correlation_estimate:
     func: syscore.correlations.correlation_single_period
     using_exponent: False
     ew_lookback: 500
     min_periods: 20
     floor_at_zero: True

   mean_estimate:
     func: syscore.algos.mean_estimator
     using_exponent: False
     ew_lookback: 500
     min_periods: 20

   vol_estimate:
     func: syscore.algos.vol_estimator
     using_exponent: False
     ew_lookback: 500
     min_periods: 20

Checking my intuition

Here's what we get when we actually run everything with some sensible parameters:

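from matplotlib.pyplot import show, title
from systems.provided.futures_chapter15.estimatedsystem import futures_system

system=futures_system()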
system.config.forecast_weight_estimate["pool_instruments"]=True
system.config.forecast_weight_estimate["method"]="bootstrap"
system.config.forecast_weight_estimate["equalise_means"]=False
system.config.forecast_weight_estimate["monte_runs"]=200
system.config.forecast_weight_estimate["bootstrap_length"]=104

system=futures_system(config=system.config)

system.combForecast.get_raw_forecast_weights("CORN").plot()
title("CORN")
show()

Raw forecast weights pooled across instruments. Bumpy ride.

Although I've plotted these for corn, they will be the same across all instruments. Almost half the weight goes in carry; makes sense since this is relatively uncorrelated (half is what my simple optimisation method - handcrafting - would put in). Hardly any (about 10%) goes into the medium speed trend following rule; it is highly correlated with the other two rules. Out of the remaining variations the faster one gets a higher weight; this is the law of active management at play I guess.

Smooth operator - how not to incur costs changing weights

Notice how jagged the lines above are. That's because I'm estimating weights annually. This is kind of silly; I don't really have tons more information after 12 months; the forecast weights are estimates - which is a posh way of saying they are guesses. There's no point incurring trading costs when we update these with another year of data.

The solution is to apply a smooth:

forecast_weight_estimate:
   ewma_span: 125
   cleaning: True
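To see what a span of 125 does to an annual jump in weights, here is a minimal sketch of an exponentially weighted moving average with alpha = 2 / (span + 1), the same convention as pandas ewm(span=..., adjust=False). It assumes the raw weights are held as a daily series, so 125 is about half a year of business days; the series itself is made up.

```python
def ewma(values, span):
    """Exponentially weighted moving average with alpha = 2 / (span + 1)."""
    alpha = 2.0 / (span + 1.0)
    smoothed = [values[0]]
    for v in values[1:]:
        smoothed.append(alpha * v + (1.0 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical raw weight: jumps from 30% to 50% at an annual refit
raw = [0.30] * 250 + [0.50] * 250
smooth = ewma(raw, span=125)
```

On the day of the refit the smoothed weight moves by only about 0.3 percentage points, and it takes most of the following year to get close to the new value.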

Now if we plot forecast_weights, rather than the raw version, we get this:

system.combForecast.get_forecast_weights("CORN").plot()
title("CORN")
show()

Smoothed forecast weights (pooled across all instruments)

There's still some movement; but any turnover from changing these parameters will be swamped by the trading the rest of the system is doing.

Forecast diversification multiplier

Now we have some weights we need to estimate the forecast diversification multiplier; so that our portfolio of forecasts has the right scale (an average absolute value of 10 is my own preference).

Correlations

First we need to get some correlations. The more correlated the forecasts are, the lower the multiplier will be. As you can see from the config options we again have the option of pooling our correlation estimates.

forecast_correlation_estimate:
   pool_instruments: True
   func: syscore.correlations.CorrelationEstimator ## function to use for estimation. This handles both pooled and non pooled data
   frequency: "W"   # frequency to downsample to before estimating correlations
   date_method: "expanding" # what kind of window to use in backtest
   using_exponent: True # use an exponentially weighted correlation, or all the values equally
   ew_lookback: 250 ## lookback when using exponential weighting
   min_periods: 20 # min_periods, used for both exponential, and non exponential weighting

Smoothing, again

We estimate correlations, and weights, annually. Thus as with weightings it's prudent to apply a smooth to the multiplier. I also floor negative correlations to avoid getting very large values for the multiplier.

forecast_div_mult_estimate:
   ewma_span: 125 ## smooth to apply
   floor_at_zero: True ## floor negative correlations
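For reference, the diversification multiplier in the book is 1 / sqrt(w H w'), where w is the vector of weights and H the correlation matrix. A minimal sketch, including the floor on negative correlations (the function name and example numbers are illustrative):

```python
import math


def diversification_multiplier(weights, corr, floor_at_zero=True):
    """Return 1 / sqrt(w H w'), where H is the correlation matrix."""
    variance = 0.0
    for i, w_i in enumerate(weights):
        for j, w_j in enumerate(weights):
            rho = corr[i][j]
            if floor_at_zero and rho < 0.0:
                rho = 0.0  # treat negative correlations as zero
            variance += w_i * w_j * rho
    return 1.0 / math.sqrt(variance)


weights = [0.5, 0.5]
uncorrelated = diversification_multiplier(weights, [[1.0, 0.0], [0.0, 1.0]])
identical = diversification_multiplier(weights, [[1.0, 1.0], [1.0, 1.0]])
negative = [[1.0, -0.5], [-0.5, 1.0]]
floored = diversification_multiplier(weights, negative)
unfloored = diversification_multiplier(weights, negative, floor_at_zero=False)
```

Two uncorrelated forecasts give a multiplier of about 1.41, two identical ones give 1.0, and without the floor a correlation of -0.5 would push the multiplier up to 2.0.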

system.combForecast.get_forecast_diversification_multiplier("EDOLLAR").plot()
show()

system.combForecast.get_forecast_diversification_multiplier("V2X").plot()
show()

Forecast Div. Multiplier for Eurodollar futures

Notice that when we don't have sufficient data to calculate correlations, or weights, the FDM comes out with a value of 1.0. I'll discuss this more below in "dealing with incomplete data".

From subsystem to system

We've now got a combined forecast for each instrument - the weighted sum of trading rule forecasts, multiplied by the FDM. It will look very much like this:

system.combForecast.get_combined_forecast("EUROSTX").plot()
show()

Combined forecast for Eurostoxx. Note the average absolute forecast is around 10. Clearly a choppy year for stocks.

Using chapters 9 and 10 we can now scale this into a subsystem position. A subsystem is my terminology for a system that trades just one instrument. Essentially we pretend we're using our entire capital for just this one thing.

Going pretty quickly through the calculations (since you're either familiar with them, or you just don't care):

system.positionSize.get_price_volatility("EUROSTX").plot()
show()

Eurostoxx price volatility. A bit less than 1% a day in 2014, a little more exciting recently.

system.positionSize.get_block_value("EUROSTX").plot()
show()

Block value (value of 1% change in price) for Eurostoxx.

system.positionSize.get_instrument_currency_vol("EUROSTX").plot()
show()

Eurostoxx: Instrument currency value: Volatility in euros per day

system.positionSize.get_instrument_value_vol("EUROSTX").plot()
show()

Eurostoxx instrument value volatility: volatility in base currency ($) per day, per contract

system.positionSize.get_volatility_scalar("EUROSTX").plot()
show()

Eurostoxx vol scalar: Number of contracts we'd hold in a subsystem with a forecast of +10

system.positionSize.get_subsystem_position("EUROSTX").plot()
show()

Eurostoxx subsystem position
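The chain of calculations plotted above can be sketched in a few lines. This follows the definitions in the captions (block value is the value of a 1% price move, and so on), and assumes the daily cash volatility target is the annual target divided by 16. The function name and all the input numbers are hypothetical, not real Eurostoxx data.

```python
def size_subsystem(forecast, price, point_value, price_vol_pct, fx_rate,
                   capital, annual_vol_target_pct):
    """Walk from price volatility to a subsystem position (in contracts)."""
    # value of a 1% price move, in the instrument's currency
    block_value = price * 0.01 * point_value
    # daily volatility per contract, in the instrument's currency
    instrument_currency_vol = block_value * price_vol_pct
    # the same thing in the base currency
    instrument_value_vol = instrument_currency_vol * fx_rate
    # assumed annualisation: divide the annual cash target by 16
    daily_cash_vol_target = capital * annual_vol_target_pct / 100.0 / 16.0
    # contracts held for a forecast of +10
    volatility_scalar = daily_cash_vol_target / instrument_value_vol
    subsystem_position = volatility_scalar * forecast / 10.0
    return {
        "block_value": block_value,
        "instrument_currency_vol": instrument_currency_vol,
        "instrument_value_vol": instrument_value_vol,
        "volatility_scalar": volatility_scalar,
        "subsystem_position": subsystem_position,
    }


result = size_subsystem(forecast=10.0, price=3000.0, point_value=10.0,
                        price_vol_pct=1.0, fx_rate=1.1,
                        capital=1000000.0, annual_vol_target_pct=25.0)
```

With these made-up inputs the block value is 300, the instrument value volatility is 330 per contract per day, and a forecast of +10 translates into roughly 47 contracts.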

Instrument weights

We're not actually trading subsystems; instead we're trading a portfolio of them. So we need to split our capital - for this we need instrument weights. Oh yes, it's another optimisation problem, with the assets in our portfolio being subsystems, one per instrument.

import pandas as pd
from matplotlib.pyplot import show

instrument_codes=system.get_instrument_list()

## percentage p&l for each subsystem, one column per instrument
pandl_subsystems=[system.accounts.pandl_for_subsystem(code, percentage=True)
                  for code in instrument_codes]

pandl=pd.concat(pandl_subsystems, axis=1)
pandl.columns=instrument_codes

pandl.cumsum().plot()
show()

Account curves for instrument subsystems

Most of the issues we face are similar to those for forecast weights (except pooling. You don't have to worry about that anymore). But there are a couple more annoying wrinkles we need to consider.

Missing in action: dealing with incomplete data

As the previous plot illustrates we have a mismatch in available history for different instruments - loads for Eurodollar, Corn, US10; quite a lot for MXP, barely any for Eurostoxx and V2X.

This could also be a problem for forecasts, at least in theory, and the code will deal with it in the same way.

Remember when testing out of sample I usually recalculate weights annually. Thus on the first day of each new 12 month period I face having one or more of these beasts in my portfolio:

1. Assets which weren't in my fitting period, and aren't used this year
2. Assets which weren't in my fitting period, but are used this year
3. Assets which are in some of my fitting period, and are used this year
4. Assets which are in all of the fitting period, and are used this year

Option 1 is easy - we give them a zero weight.

Option 4 is also easy; we use the data in the fitting period to estimate the relevant statistics.

Option 2 is relatively easy - we give them a "downweighted average" weight. Let me explain. Let's say we have two assets already, each with 50% weight. If we were to add a further asset we'd allocate it an average weight of 33.3%, and split the rest between the existing assets. In practice I want to penalise new assets; so I only give them half their average weight. In this simple example I'd give the new asset half of 33.3%, or 16.67%.

We can turn off this behaviour, which I call cleaning. If we do we'd get zero weights for assets without enough data.

instrument_weight_estimate:
   cleaning: False
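The downweighted average weight for a new asset can be sketched as follows. The helper name is hypothetical, and it assumes the remaining weight is shared between the existing assets in proportion to their current weights.

```python
def add_new_asset(existing_weights, penalty=0.5):
    """Give a new asset a penalised average weight; rescale the rest."""
    n_total = len(existing_weights) + 1
    new_weight = penalty / n_total          # half of the 1/N average weight
    scale = (1.0 - new_weight) / sum(existing_weights)
    return [w * scale for w in existing_weights] + [new_weight]


weights = add_new_asset([0.5, 0.5])
```

For the two-asset example in the text this gives the new asset 16.67% and each existing asset 41.67%.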

Option 3 depends on the method we're using. If we're using shrinkage or one period, then as long as there's enough data to exceed minimum periods (default 20 weeks) then we'll have an estimate. If we haven't got enough data, then it will be treated as a missing weight; and we'd use downweighted average weights (if cleaning is on), or give the absent instruments a zero weight (with cleaning off).

For bootstrapping we check to see if the minimum period threshold is met on each bootstrap run. If it isn't then we use average weights when cleaning is on. The less data we have, the closer the weight will be to average. This has a nice Bayesian feel about it, don't you think? With cleaning off, less data will mean weights will be closer to zero. This is like an ultra conservative Bayesian.

If you don't get this joke, there's no point in me trying to explain it (Source: www.lancaster.ac.uk)

Let's plot them

We're now in a position to optimise, and plot the weights:

(By the way because of all the code we need to deal properly with missing weights on each run, this is kind of slow. But you shouldn't be refitting your system that often...)

system.config.instrument_weight_estimate["method"]="bootstrap" ## slow; use "shrinkage" to speed things up
system.config.instrument_weight_estimate["equalise_means"]=False
system.config.instrument_weight_estimate["monte_runs"]=200
system.config.instrument_weight_estimate["bootstrap_length"]=104

system.portfolio.get_instrument_weights().plot()
show()

Optimised instrument weights

These weights are a bit different from equal weights, in particular the better performance of US 10 year and Eurodollar is being rewarded somewhat. If you were uncomfortable with this you could turn equalise means on.

Instrument diversification multiplier

Missing in action, take two

Missing instruments also affect estimates of correlations. You know, the correlations we need to estimate the diversification multiplier. So there's cleaning again:

instrument_correlation_estimate:
   cleaning: True

I replace missing correlation estimates* with the average correlation, but I don't downweight it. If I downweighted the average correlation the diversification multiplier would be biased upwards - i.e. I'd have too much risk on. Bad thing. I could of course use an upweighted average; but I'm already penalising instruments without enough data by giving them lower weights.

* where I need to, i.e. options two and three
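A minimal sketch of this kind of cleaning: missing correlations (represented here as None) are replaced by the plain average of the off-diagonal values we do have. The function name, the None convention and the example matrix are all assumptions for illustration.

```python
def clean_correlations(corr):
    """Replace missing (None) correlations with the average known value."""
    n = len(corr)
    known = [corr[i][j] for i in range(n) for j in range(n)
             if i != j and corr[i][j] is not None]
    # arbitrary fallback of zero if nothing at all is known (sketch only)
    average = sum(known) / len(known) if known else 0.0
    cleaned = []
    for i in range(n):
        row = []
        for j in range(n):
            if i == j:
                row.append(1.0)
            elif corr[i][j] is None:
                row.append(average)   # not downweighted
            else:
                row.append(corr[i][j])
        cleaned.append(row)
    return cleaned


# Hypothetical matrix: the correlation between assets 0 and 2 is missing
raw = [[1.0, 0.4, None],
       [0.4, 1.0, 0.8],
       [None, 0.8, 1.0]]
cleaned = clean_correlations(raw)
```

Here the missing pair is filled with 0.6, the average of the two known correlations.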

Let's plot it

system.portfolio.get_instrument_diversification_multiplier().plot()
show()

Instrument diversification multiplier

And finally...

We can now work out the notional positions - allowing for subsystem positions, weighted by instrument weight, and multiplied by instrument diversification multiplier.

system.portfolio.get_notional_position("EUROSTX").plot()
show()

Final position in Eurostoxx. The actual position will be a rounded version of this.
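As a quick worked example of that last step, with made-up numbers for the subsystem position, instrument weight and multiplier:

```python
def notional_position(subsystem_position, instrument_weight, idm):
    """Subsystem position, scaled by instrument weight and the IDM."""
    return subsystem_position * instrument_weight * idm


# Hypothetical inputs: 47.35 contracts, 20% weight, multiplier of 1.5
notional = notional_position(47.35, 0.20, 1.5)
rounded = round(notional)   # whole contracts actually held
```

The notional position is about 14.2 contracts, which rounds to 14.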

End of post

No quant post would be complete without an account curve and a Sharpe Ratio.

And an equation. Bugger, I forgot to put an equation in.... but you got a Bayesian cartoon - surely that's enough?

print(system.accounts.portfolio().stats())

system.accounts.portfolio().cumsum().plot()

show()

Overall performance. Sharpe ratio is 0.53. Annualised standard deviation is 27.7% (target 25%)

Stats: [[('min', '-0.3685'), ('max', '0.1475'), ('median', '0.0004598'),
('mean', '0.0005741'), ('std', '0.01732'), ('skew', '-1.564'),
('ann_daily_mean', '0.147'), ('ann_daily_std', '0.2771'),
('sharpe', '0.5304'), ('sortino', '0.6241'), ('avg_drawdown', '-0.2445'),
('time_in_drawdown', '0.9626'), ('calmar', '0.2417'),
('avg_return_to_drawdown', '0.6011'), ('avg_loss', '-0.011'),
('avg_gain', '0.01102'), ('gaintolossratio', '1.002'),
('profitfactor', '1.111'), ('hitrate', '0.5258')]
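The annualised figures in these stats can be reproduced from the daily mean and standard deviation, assuming a 256 business day year (so volatility scales by 16). That convention is an assumption here, but it is consistent with the numbers shown.

```python
import math

BUSINESS_DAYS_IN_YEAR = 256  # assumption: gives a square root of exactly 16


def annualise(daily_mean, daily_std):
    """Annualised mean, standard deviation and Sharpe ratio."""
    ann_mean = daily_mean * BUSINESS_DAYS_IN_YEAR
    ann_std = daily_std * math.sqrt(BUSINESS_DAYS_IN_YEAR)
    return ann_mean, ann_std, ann_mean / ann_std


ann_mean, ann_std, sharpe_ratio = annualise(0.0005741, 0.01732)
print(round(ann_mean, 3), round(ann_std, 4), round(sharpe_ratio, 2))
# 0.147 0.2771 0.53
```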

This is a better output than the version with fixed weights and diversification multiplier that I've posted before; mainly because a variable multiplier leads to a more stable volatility profile over time, and thus a higher Sharpe Ratio.

153 comments:

Max, 6 February 2016 at 19:30
Rob, again thank you for the article.
I am curious, is there an easy way to feed your system directly from Quandl instead of legacyCSV files?

Rob Carver, 7 February 2016 at 14:22
Getting data from quandl python api is very easy. The hard thing is to produce the two kinds of data - stitched prices (although quandl do have this) and aligned individual contracts for carry. So the hard bit at least for futures trading is writing the piece that takes raw individual contracts and produces these two things.

This is on my list to do...

AvantGarde, 9 February 2016 at 06:07
I had a few Q's on above:

OPTIMISATION

When you optimise to assign weights to rules, what do you do in your OWN system:
1. i) do you optimise the weights for each trading rule based on each instrument individually, so each trading rule has a different weight depending on the instrument, or ii) do you optimise the weights for trading rules based on pooled data across all instruments?
2. if the answer above is ii) how do you assign the WEIGHTS TO THE INSTRUMENTS when you pool them in the optimisation to determine the WEIGHTS TO THE TRADING RULES? Are the instrument weights determined in a prior optimisation before assigning weights to trading rules? Is your process to first optimise the weights assigned to each instrument, and after this is done you pool the instruments based on these weights to optimise for the weights for each trading rule?

FORECAST SCALARS

When we calculate average forecast scalars, what do you personally do:
1. do you calculate the median or arithmetic average?
2. in order to calculate the average, do you personally pool all the instruments, or do you take the average forecast from each instrument individually?

Apologies for the caps, could not find any other way to add emphasis.

AvantGarde, 10 February 2016 at 00:16
Sorry Rob, I am still trying to wrap my head around this. So to confirm, the instrument weights are determined in a SEPARATE optimisation that is INDEPENDENT from the optimisation of the weights assigned to trading rules? So two separate optimisations?

AvantGarde, 10 February 2016 at 06:27
OK, this is clear in my mind now. Thank you!

Darrin, 6 March 2016 at 20:02
Hi Rob,

Can you perhaps write a blog post about how the Semi Automated trader could develop scaled forecasts? In the book, the examples of CFD bets (not available to those of us in the US) are very helpful, but what if we like the way in which your signals fluctuate from moderately strong to stronger?

Darrin, 10 March 2016 at 17:50
Right, that makes sense. Perhaps I'm just not fully understanding. Based on the walk-through examples in the book for the Semi-automatic trader using CFDs, the signals aren't combined or anything fancy like that. Like you said, it's just a matter of translating gut feel into an integer.

I just wanted to know if it were possible for the discretionary trader to develop a weighted combined forecast, similar to the staunch systems trader. One of the most attractive features of your system is the fact that the signal generation is done for you on a routine basis.

Based on my limited understanding, it seemed like the semi-automatic trader is limited to explicit stop losses and arbitrary binary trading.

Kris, 20 March 2016 at 11:19
Hi Robert,

I've a question about forecast weights.

At first, more theoretical...
I want to use bootstrapping to determine the forecast weights. I think it's best to calculate separate forecast weights for each element because the cost/instrument can vary substantially per instrument. Also in my opinion it's important to take into account the trading costs for the calculation of the forecast weights, because a fast trading system will generate a lot of trading costs (I work with CFD's) and I think a lower participation in the combined forecast for the faster system will be better.
Do you agree with these ideas?

Now more practical...
My idea is to calculate a performance curve for each trading rule variation for each instrument and use these performance curves for bootstrapping.

Is the following method correct:
1. Daily calculation per instrument and per trading rule variation
- calculate scaled forecast
- calculate volatility scalar
- calculate number of contracts
- calculate profitloss (including trading costs)
- create accountcurve

2. use bootstrapping method per instrument using all the account curves for all used trading rule variations. The result should be the forecast weights per instrument (subsystem)

Is this the correct way ?

Thank you
Kris

Roei, 4 April 2016 at 20:38
I was listening to the Perry Kaufman podcast on Better System Trader, and he said that true volatility adjustment doesn't work for stocks.

The argument is that because stocks have low leverage, if you are trading a stock with low volatility you will need to invest a lot of money to bring that volatility up to match other stocks, and you may not have enough money to do that. Another option is to reduce the position in the other stocks, but then you're not using all the money.

What he suggested is dividing an equal investment by the stock price.

I wonder what your thoughts are on this?

Anonymous, 11 April 2016 at 03:35
I have two questions:

1.) I may have missed somewhere if you mentioned it, but how do you manage hedging currencies? It seems like you're trading in pounds, so for instance how do you hedge contracts denominated in AUD?

2.) What is your margin to equity? This is something I keep hearing about. For instance backtesting a few different strategies and running the margins in CME database shows a margin to equity of about 35% when I am targeting 15% vol. This seems high compared to other managed futures strategies that say about 15% margin to equity and have higher volatility (even while trading more markets than I). Any thoughts would be more than appreciated!!

Anonymous, 13 April 2016 at 14:57
I hope you don't mind questions!

You say you have a 10% buffer around the current position (i.e. if the weight at rebalance is 50% and the target is 45%, you keep it at 50% because it is within 10%). However, what if you have a situation where the position changes from, say, +5% to -4%? This is within the 10% buffer but the signs have changed, what do you do with your position?

AvantGarde, 1 May 2016 at 07:42
Hi Rob,
If you don't mind me asking, are your log-scale equity curve charts in base 'e' or base 10?
Thanks

AvantGarde, 4 June 2016 at 00:15
Also, from what I have read, it seems your instrument and rule weights are only updated each time a new instrument enters your system, so you hardcode these weights in your own config; however, these weights do incrementally change each day as you apply a smooth to them. How can one set this up in pysystemtrade? I understand how you hardcode the weights in the config, but how do I apply a smooth to them in pysystemtrade? Or is this done automatically if I included e.g., 'instrument_weight_estimate: ewma_span: 125' in the config?

Wesay GAINZ, 7 June 2016 at 07:33
Hi Rob,
If I wanted to apply a trading rule to one instrument, say ewmac8_32 just to Corn, and another trading rule to another instrument, say ewmac32_128 just to US10, and combine them into a portfolio so that I could get the account statistics, how could I do that? The typical method of creating systems obviously applies each trading rule to each market.

I suspect that this would have to be done at the TradingRule stage such that a TradingRule (consisting of the rule, data, and other_args) would be constructed for the 2 cases above. However, I'm having trouble passing the correct "list" of data to the TradingRule object. And, if that is possible, what would need to be passed in for the "data" at the System level i.e. my_system=System([my_rules], data)? I suspect that if all this is possible, it could also be done with a YAML file correct? Thank you so much for any advice and pointing me in the right direction!

Wesay GAINZ, 8 June 2016 at 06:37
Thank you, will test this out!

Dmitry, 21 June 2016 at 12:04
Hi Rob,
Thank you for an excellent book. I am trying to rewrite some parts of your system in a different language (broker doesn't support python) and add live trading. However I got a bit stuck while I was trying to reproduce the calculations of volatility scalar. For some reason when I request system.positionSize.get_volatility_scalar("CORN") I receive just a series of NaNs, but the subsystem position is somehow calculated. I don't really understand why that is happening.

Murat, 4 August 2016 at 10:31
Hi Rob, I tried to reproduce forecast weight estimation with pooling, and bootstrap, using this code

from matplotlib.pyplot import show, title
from systems.provided.futures_chapter15.estimatedsystem import futures_system

system=futures_system()
system.config.forecast_weight_estimate["pool_instruments"]=True
system.config.forecast_weight_estimate["method"]="bootstrap"
system.config.forecast_weight_estimate["equalise_means"]=False
system.config.forecast_weight_estimate["monte_runs"]=200
system.config.forecast_weight_estimate["bootstrap_length"]=104

system=futures_system(config=system.config)

system.combForecast.get_raw_forecast_weights("CORN").plot()
title("CORN")
show()

The output came out different than your results,

https://dl.dropboxusercontent.com/u/5114340/tmp/weights.png
https://dl.dropboxusercontent.com/u/5114340/tmp/weights.log

Did I have to configure something through YAML as well as Python code? It seemed like the code above was enough.

Thanks,

Murat, 25 August 2016 at 10:08
One coding question for the correlation matrix - Chapter 15 example system. With this code,

http://goo.gl/2caO1K

I get 0.89 for E$-US10 correlation, Table 46 in Systematic Trading says 0.35. I understand ST table combines existing numbers for that number, but the difference seems too big. Maybe I did something wrong in the code? I take PNL results for each instrument, and feed it all to CorrelationEstimator.

Thanks,

P_Ser, 28 August 2016 at 16:50
Dear Rob,

where can I find information on how to calculate account curves for trading rule variations from raw forecasts?

Do I assume I use my whole trading capital for my cash volatility target to calculate position size and then return, or should I pick a certain % volatility target, assuming ("guessing") in advance a certain Sharpe ratio I'm planning to achieve on my portfolio?

Thanks,
Peter

Kris, 15 September 2016 at 18:59
Hi Rob,

I'm searching for the historical data on the websites you mentioned in the book. I'm looking at the six instruments you also use in this post. On Quandl I can find continuous contracts but these use a rollover method at contract expiry and there is no price adjustment. I'm wondering if this is good enough for backtesting, because the effective rolling is totally different from the (free) data from Quandl. Also with the premium subscription there are only a limited number of methods for rolling. For example: if we roll corn futures in the summer and work only on December contracts, I think this is not possible with Quandl (and I think also other data providers like CSIData.com). I'm thinking of writing my own rolling methods myself. Is this a good idea and is it necessary to do this (it is time consuming)? How do you handle this problem?

Kris

P_Ser, 6 January 2017 at 22:15
Hi, Rob! I'm struggling with forecast correlation estimates used for fdm calculation, could you plz explain what is ew_lookback parameter and how exactly you calculate ewma correlations?

E.g. with pooled weekly returns I use the first ew_lookback = 250 data points to calculate ewma correlations, then expand my window to 500 data points and calculate correlations on this new set using 500 ewma, etc.? Why use 250 and not 52 if using weekly returns?

Thank you!

lvymath, 17 January 2017 at 15:31
Hello, after looking through the python code, I wonder how you came up with the adj_factor for costs when estimating forecast weights. Via simulation? Thanks!

Chad B, 19 February 2017 at 17:43
Hello Rob,
Would you consider making the ewma_span period for smoothing your forecast weights a variable instead of a fixed value, perhaps by some additional logic to detect the different volatility 'regimes' that are seen in the market? Or maybe such a notion is fair, but this is the wrong place to apply it, and it should be applied at the individual instrument level or in strategy scripts?

Kris, 27 February 2017 at 19:13
Hi Robert,

For the diversification multiplier you mention using exponential weighting. Where or how do you implement this? On the returns, or on the deviations of the returns from the expected returns (so just before the calculation of the covariances)? Or maybe somewhere else?

Can you give me some direction?

Thanks

Kris

Dolph, 13 March 2017 at 16:10
Rob, in your legacy.csv modules, some specific futures have the "price contract" as the "front month" (closest contract), like Bund, US20 & US10, etc. Meanwhile, others such as Wheat, gas, crude, etc. have the "carry contract" as the front month. Is this by design?

Deano, 20 March 2017 at 22:52
Hi Rob,

Thank you so much for your book. It is very educational. I was trying to understand more about trading rule correlations in "Chapter 8: Combined Forecasts". You mentioned back-testing the performance of trading rules to get the correlations.

Could you share a bit more insights on how you get the performance of trading rules, please?
(1) Do you set buy/sell thresholds at +/- 10? Meaning that no position is held when the signal is in [-10,10], only 1 position is held when the signal is in [10,20] or [-20,-10], and 2 positions are held when the signal is at -20/+20?
(2) Are trading costs considered? (I think the answer is yes.)
(3) You enter a buy trade, say at signal=10. When do you signal to exit the trade? When signal<10 or signal=0?

Or do you use dynamic positions, meaning the position varies with the signal all the time?

Another question regarding optimisation:
In the formula f*w - lambda*w*sigma*w' to estimate weights:
(1) Is f the rules' Sharpe ratio calculated using the rules' historical performance pooled from all instruments, or just the Sharpe of the rule for the instrument we are looking at?
(2) How do you define lambda? Is it 0.0001? If so, is it always 0.0001?

Sorry if those two questions had been asked before.

Thanks,
Deano

Cal2223, 29 April 2017 at 22:43
Rob,

Is one way to estimate correlations with nonsynchronous trading to run correlations on rolling 3-day returns over a lookback of 60 days? (Which I know is much shorter than yours.)

Cal2223, 3 May 2017 at 15:44
Thank you! The idea actually came from the Betting Against Beta paper by the guys at AQR. They say they use overlapping (or rolling) 3-day log returns to calculate correlations to control for non-synchronous trading over 120 trading days.

I think it is safe to say you're disagreeing with their approach?
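
The overlapping-return idea discussed in this comment can be sketched as follows. This is only an illustration on synthetic data, not the AQR implementation and not what pysystemtrade does; the function name, the 120-day lookback default and the one-day lag used to mimic non-synchronous closes are all assumptions.

```python
import numpy as np

def overlapping_return_corr(prices_a, prices_b, horizon=3, lookback=120):
    """Correlation of overlapping `horizon`-day log returns over the last `lookback` days."""
    log_a = np.log(np.asarray(prices_a, dtype=float))
    log_b = np.log(np.asarray(prices_b, dtype=float))
    # Overlapping returns: one observation per day, each spanning `horizon` days.
    ret_a = log_a[horizon:] - log_a[:-horizon]
    ret_b = log_b[horizon:] - log_b[:-horizon]
    # Keep only the most recent `lookback` observations.
    ret_a, ret_b = ret_a[-lookback:], ret_b[-lookback:]
    return float(np.corrcoef(ret_a, ret_b)[0, 1])

# Synthetic example (hypothetical data): market B reflects market A's shock one day late.
rng = np.random.default_rng(0)
shocks = rng.normal(0.0, 0.01, size=500)
prices_a = 100.0 * np.exp(np.cumsum(shocks))
prices_b = 100.0 * np.exp(np.cumsum(np.roll(shocks, 1)))

daily_corr = overlapping_return_corr(prices_a, prices_b, horizon=1)
three_day_corr = overlapping_return_corr(prices_a, prices_b, horizon=3)
```

With a one-day lag, daily returns share no common shock, while each pair of 3-day returns shares two of its three shocks, so the longer horizon recovers much more of the true relationship. The overlapping observations are autocorrelated, so the effective sample size is smaller than the lookback suggests.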

Chad B, 18 July 2017 at 05:04
Good morning, Rob.
When I run your ch_15 system along with default configs, trading rules, etc. unmodified, the stages run fine. If I substitute the legacyCSV files of several instruments with *intraday* 1-minute bars in the same 2-column format, both for the '_price' file and '_carrydata' (expiration months spaced from current date like you showed in legacy versions), spanning 5 days each, and re-running the system but changing nothing except reduction of the instrument_list entries, I get the error from line 530 (get_notional_position) of /systems/portfolio.py "No rules are cheap enough for CRUDE_W with threshold of 1.300 SR units! Raise threshold (...), add rules, or drop instrument."
I raised it from the original 0.13 to 1.3, and in other tests as high as 100 (ridiculous value of course, just testing...), same result. Seems I'm overlooking a simple principle of the system, but I can't figure out why, given the trading rules were left the same. Can you offer a pointer?

Chad B, 18 July 2017 at 05:07
Also, since many of the instruments in the legacy data have a lot of days near the final years of the records with more than one recorded value per day, it seems that using new CSVs with intraday data would be feasible in short order, but making sure to change the period of several calculations in other stages to recognize the periodicity on a minute scale, instead of days, no?
Sorry in advance for my hasty monologue...

Chad B, 18 July 2017 at 05:09
P.S. for example, does the diversification multiplier need to be modified for interpreting 1-minute periods instead of sampling at end-of-day? What about volatility scaling floors currently set with daily period?

Unknown, 25 July 2017 at 10:26
Dear Mr. Carver,

First of all, I really appreciate you sharing detailed & practical knowledge of quantitative trading.

I have a few questions while reading your book & blog posts.

I am trying to develop a trend following trading system with ETFs using the framework in your book.

The trading system is long-only and constrained by a leverage limit (100%).

Under these constraints, what is the best way to use your framework properly?

Are there any changes in calculating forecast scalars, forecast weights, FDM, IDM, etc.?

My thought is...

Solution 1.
- Maintain all the procedures in your framework as if I could long/short and have no leverage limit. (Suppose that I have 15% target vol)

- When I calculate positions for trading, I just assign zero position for negative forecasts.

And if the sum of long positions exceeds my capital, I scale down the positions so that the portfolio is not leveraged.

Solution 2.
- Forecast scalar:
No change. I calculate forecasts and scale them (-20 ~ +20).

- Forecast weights, Correlation:
For each trading rule:
+ Calculate portfolio returns of pooled instruments according to the forecasts.
+ Replace returns for negative forecasts with zeros. (Zero position instead of short.)
+ Scale down the returns for positive forecasts when the sum of long positions exceeds my capital.
+ Use these trading rule returns when bootstrapping or calculating correlations.
+ Forecast weights are optimized using these returns.
- FDM:
+ Calculate the FDM based on forecast weights and correlations among the forecasts, as in your framework.
+ Calculate the historical participation (= sum(long position)/myCapital) using the new rescaled forecasts and forecast weights.
+ Check the median participation over the back-tested period.
+ If it exceeds 100%, I scale down the FDM so that my portfolio does not take too much risk.
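
The position step of Solution 1 above (drop shorts, then scale longs to respect the leverage limit) can be sketched like this. It is a minimal illustration of the commenter's proposal, not part of pysystemtrade; the function name and the example position values are hypothetical.

```python
def long_only_capped(position_values, capital, max_leverage=1.0):
    """Zero out short positions, then scale longs so gross exposure <= max_leverage * capital."""
    # Negative (short) positions become zero; longs are kept as they are.
    longs = [max(value, 0.0) for value in position_values]
    gross = sum(longs)
    limit = max_leverage * capital
    if gross <= limit:
        return longs
    # Scale every long position by the same factor to hit the limit exactly.
    scale = limit / gross
    return [value * scale for value in longs]

# Hypothetical position values (in currency) from an unconstrained long/short system.
unconstrained = [60_000.0, -25_000.0, 90_000.0]
capped = long_only_capped(unconstrained, capital=100_000.0)
```

Here the short is dropped and the two longs are scaled by two thirds, so gross exposure equals capital. Scaling all longs by one factor keeps their relative sizes, but it also means realised risk will sit below the nominal volatility target whenever the cap binds.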

Frankly speaking, I don't know what the right way is. Neither way seems proper. Maybe it is because of my lack of understanding.

Would you give any advice?

I am really looking forward to your 2nd book. Thanks for reading.

Best regards,

Michael Kim

Patrick, 25 August 2017 at 08:54
Hi Rob, I have a question with regard to setting up data prior to optimising weights using bootstrapping. If we follow your advice, forecast returns are already standardised across instruments through dividing by say 36-day EWMA vol. However, I understand from the above example, it makes sense also to equalise vols and means. I assume the vol_equaliser fn does this by rescaling the time series of returns so that all the forecast distributions are virtually the same over the entire series (i.e. have identical Sharpes). The weights you derive would presumably be that of a min variance portfolio and therefore relies on a solution based entirely on the correlations between the returns. Is the above correct? I assume you recommend the same procedure for bootstrapping subsystem weights (i.e. equalise means and vol). Now when using pooled data for forecasts, my thinking is fuzzier: is it advisable not to equalise means or vol?

Patrick, 25 August 2017 at 15:08
Hi Rob, thanks for this. Just so I can get my head around your answer a bit better, a question on terminology: are 'estimated' vol and 'realised' vol the same thing and equal to the vol used for standardisation (i.e. rolling historic 36d EWM vol)? As I understand it the two inputs into the optimisation you do are correlation and mean returns. So are you saying that if we relied merely on vol standardisation (using recent realised vol) then a period of high vol for an instrument with a short data history but high forecasts would lower the forecasts and their corresponding weights? I am failing to make the connection between high forecasts and high price vol which is used for standardisation. I am sorry if I have completely missed the point.

On a related point, and I should have asked this earlier: on page 289 of your book you recommend that prior to optimisation we should ensure 'returns have been vol normalised' I assume this is the same as 'equalisation' that you refer to in this post and not the same as standardisation (btw the term volatility normalised is in bold so perhaps your publishers might consider putting a reference in the glossary for future editions before your book becomes compulsory reading for our grandkids).

Patrick, 28 August 2017 at 13:08
Hi Rob, it's possible I worded my original question poorly,

I will try to be more systematic, so please bear with me. To recap on my understanding:

1. 'vol normalisation' is what you do when standardising forecasts. This is typically done by dividing by rolling 36 day EWMA * current price
2. 'vol standardisation' is what you do when standardising subsystems. I would again use rolling 36 day EWM vol (times block value, etc) for this
3. 'vol equalisation' is what you do prior to optimisation to scale the returns over the entire (expanding) window so over this window they have the same volatility

4. Assuming the above is correct, a subsystem position for carry and EWMAC variation is proportional to exp return/vol^2 (which coincidentally seems to be proportional to optimal Kelly scale - although not for the breakout rule).

5. When I said originally 'Now when using pooled data for forecasts, my thinking is fuzzier: is it advisable not to equalise means or vol?', to be clearer I was trying to ask whether it makes sense to equalise vols prior to optimising forecast weights when pooling (not whether to equalise vols when optimising subsystem weights, if we had pooled forecasts previously). In a previous post ('a little demonstration of portfolio optimisation') you do an asset vol 'normalisation' which I believe is the same as the 'equalisation' discussed here (scale the whole window, although not done on an expanding window), but I got the impression that for forecasts the normalisation is handled as above and this took care of the need for further equalisation (for forecasts at least).

I must admit, I had always thought that if you want to use only correlations and means to optimise then intuitively you should equalise vols in the window being sampled (because to quote you this reduces the covar matrix to a correlation matrix). However I had somehow accepted the fact that normalising forecasts by recent vol ended up doing something similar (also from reading some comments by you about not strictly needing to equalise vols for forecasts, etc). But I guess a different issue arises when pooling short histories?

In summary, assuming you deciphered my original question correctly, are you saying it is still important to equalise vols of forecasts, as the 'realised' variance of forecast returns is proportional to the level of the forecast (so a forecast of 10 would have twice the variance of a forecast of 5), causing the optimiser to downweight elevated forecasts, which is a problem when pooling short data histories? By equalising vols over the entire window being used for optimisation, we end up removing this effect? If that is what you are saying then I promise to go away and think about this much more deeply.

Thanks again for taking the time.

Patrick, 30 August 2017 at 19:35
OK, I think I get it. Really had to have a think and run some simulations, but as far as I can tell there seem to be two effects in play here. The arithmetic return from applying a rule to an instrument is the product of two rvs: instrument returns and forecasts. Assuming independence between these rvs, the variance can be shown to be a function of their first two moments. Over sufficiently long periods, these moments across different instruments are equal (asymptotically converge). However, over shorter periods there may be divergence (i.e. different averages, different vols), which will violate the assumption of equal vols required to be able to run the optimiser using correlations only. As far as I can tell there is also a more subtle effect, which arises from the fact that forecasts and instrument returns are not independent (EWMAC 2,8 and daily returns when using random gaussian data have a correlation of 45%). This inconveniently introduces covariance terms in the calc for vol. However, in the cross section of a single rule applied across different instruments over sufficiently long periods of time, the covar terms should have an equal effect across all instruments. Again, over short periods there may be divergence. This divergence in small samples from the assumed distribution of the population is presumably why it is sensible to equalise vols before optimising. Am I on the right track?
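
For reference, the standard identities behind this argument can be written out, with $F$ the forecast and $R$ the instrument return (symbols chosen here for illustration). When $F$ and $R$ are independent:

$$
\operatorname{Var}(FR) = E[F^2]\,E[R^2] - \big(E[F]\,E[R]\big)^2 = \sigma_F^2\sigma_R^2 + \sigma_F^2\mu_R^2 + \sigma_R^2\mu_F^2
$$

Without independence, the same variance picks up the covariance terms mentioned in the comment:

$$
\operatorname{Var}(FR) = E[F^2]\,E[R^2] + \operatorname{Cov}(F^2, R^2) - \big(\mu_F\mu_R + \operatorname{Cov}(F, R)\big)^2
$$

So any short-sample difference in the means, variances, or covariances of $F$ and $R$ across instruments feeds directly into different realised vols for the rule's returns.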

BTW please feel free to delete my earlier comment.

Matt, 31 August 2017 at 13:08
Hi Rob,

Do you have the breakdown of subsystem signals for the Eurostoxx? You never get short in 2015, only less long? It looks like the market heads down quite a bit. Is this because of the carry signal dwarfing the trend signal? Optically, I can't line up the forecast weights with the chart.

Thanks!

Patrick, 4 September 2017 at 10:34
Understood. And thank you again, Rob.

Chad B, 19 February 2018 at 20:50
Hi Rob, do you have any perspective of the (non)usefulness of using a different volatility measure than stddev for Sharpe (e.g. CVaR), consistent with a 95% CVaR as part of a 'Modified Sharpe'? Seems this would have more appeal when one is more concerned about the tails of the non-normal returns, esp with the volatility products like VIX.

Simon, 27 February 2018 at 17:56
Dear Rob, Thanks for sharing your work in your books and via the website. I have a question regarding the weights of the trading rules. I am interested in seeing how closely the handcrafted weights would compare to the minimum variance portfolio weights using correlations (and assuming the same volatility for each rule). I have been trying to take some examples of correlation tables (representing a possible set of trading rules) where I have assigned correlations ranging between 0 and 1, typically 0.5 to 0.75. I then calculate the minimum variance weights by inverting the covariance matrix (or correlation matrix as vols are the same). When I calculate the weights I find several have a negative weight. This doesn't make sense in the context of rules as we would not 'short' a trading rule - instead we would discard it or reverse it. Is there an easy way to adjust for the constraint of needing all weights >= zero? Many thanks, Simon
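
The effect described in this comment, and one simple way around it, can be sketched as below. This is an illustration only: the correlation matrix is made up, the function names are hypothetical, and the "drop the most negative weight and re-solve" loop is a heuristic rather than the method used in pysystemtrade. A quadratic programming solver with a non-negativity constraint is the exact alternative.

```python
import numpy as np

def min_variance_weights(corr):
    """Unconstrained minimum-variance weights: proportional to inverse(corr) @ ones."""
    raw = np.linalg.solve(corr, np.ones(len(corr)))
    return raw / raw.sum()

def long_only_min_variance(corr):
    """Heuristic: repeatedly drop the most negative weight and re-solve on the rest."""
    corr = np.asarray(corr, dtype=float)
    active = list(range(len(corr)))
    while True:
        sub = corr[np.ix_(active, active)]
        w = min_variance_weights(sub)
        if (w >= 0).all():
            break
        # Remove the rule with the most negative weight from the active set.
        active.pop(int(np.argmin(w)))
    weights = np.zeros(len(corr))
    weights[active] = w
    return weights

# Made-up correlations between three rules with equal volatility.
corr = np.array([[1.0, 0.9, 0.3],
                 [0.9, 1.0, 0.6],
                 [0.3, 0.6, 1.0]])
unconstrained = min_variance_weights(corr)  # second rule gets a negative weight
long_only = long_only_min_variance(corr)    # second rule dropped, rest re-solved
```

In this example the second rule is highly correlated with both of the others, so the unconstrained solution shorts it; the heuristic sets it to zero and splits the weight equally between the remaining two.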

Unknown, 25 November 2019 at 23:44
Robert, I bought your book recently and it's so good I can't put it down.

Hope I'm not too late to the dance for this blog post.

At present I'm trying to replicate the code to produce the final graph showing the Sharpe returns.

I get a similar graphing pattern: https://ibb.co/QHYyMqt

But notice the x-axis has 1e6. Large numbers like this also exist in the code output:

[[('min', '-9.796e+04'), ('max', '2.721e+04'), ('median', '86.45'), ('mean', '134.6'), ('std', '3899'), ('skew', '-2.784'), ('ann_mean', '3.445e+04'), ('ann_std', '6.239e+04'), ('sharpe', '0.5522'), ('sortino', '0.6348'), ('avg_drawdown', '-5.859e+04'), ('time_in_drawdown', '0.9573'), ('calmar', '0.1773'), ('avg_return_to_drawdown', '0.588'), ('avg_loss', '-2465'), ('avg_gain', '2566'), ('gaintolossratio', '1.041'), ('profitfactor', '1.113'), ('hitrate', '0.5167'), ('t_stat', '3.439'), ('p_value', '0.000586')], ('You can also plot / print:', ['rolling_ann_std', 'drawdown', 'curve', 'percent', 'cumulative'])]

Notice Sharpe ratio looks right but ann_std and other values are massive numbers.

1. Is there a full code example that shows this particular Sharpe graph and I've been stupid and missed it?
2. If not, is this likely due to me being unable to configure the system.config.forecast_correlation_estimate["func"] setting? [I get error AttributeError: module 'syscore' has no attribute 'correlations' if I uncomment that config line]
3. If not, Is it something else I'm missing from my code?

I know you're a very busy man, so I won't hold it against you if you don't have time to respond.

But I'm so very close that any help, even a sentence or two response that points me in the right direction would be great.

Here is my code:

from matplotlib.pyplot import show, title
from systems.provided.futures_chapter15.estimatedsystem import futures_system
import syscore

system=futures_system()

system.config.forecast_weight_estimate["pool_instruments"]=True
system.config.forecast_weight_estimate["method"]="bootstrap"
system.config.forecast_weight_estimate["equalise_means"]=False
system.config.forecast_weight_estimate["monte_runs"]=200
system.config.forecast_weight_estimate["bootstrap_length"]=104
system.config.forecast_weight_estimate["ewma_span"]=125
system.config.forecast_weight_estimate["cleaning"]=True

system.config.forecast_correlation_estimate["pool_instruments"]=True
# system.config.forecast_correlation_estimate["func"]=syscore.correlations.CorrelationEstimator
system.config.forecast_correlation_estimate["frequency"]="W"
system.config.forecast_correlation_estimate["date_method"]="expanding"
system.config.forecast_correlation_estimate["using_exponent"]=True
system.config.forecast_correlation_estimate["ew_lookback"]=250
system.config.forecast_correlation_estimate["min_periods"]=20

system.config.forecast_div_mult_estimate["ewma_span"]=125
# system.config.forecast_div_mult_estimate["floor_at_zero"]=True

system=futures_system(config=system.config)

print(system.accounts.portfolio().stats())

system.accounts.portfolio().cumsum().plot()

show()

Chad B, 15 March 2020 at 18:00
Hi Rob,
What might be a legitimate way to deal with unequal-length and/or asynchronous time series, specifically for correlation calculations? My brief search yielded the notion of truncating the longer series to match the shorter one.

This could be applicable to cross-correlations for say NQ-100 versus DAX or Hang Seng Index futs, but also for markets on the same continent but different session hours & durations (e.g. ICE Coffee vs WTI Crude).

Anonymous, 24 June 2020 at 15:57
Just bought your book about a week ago. Really useful stuff Rob. Really appreciate it. Just getting started with trading. Hopefully will make some things work out.

Thanks!!!

Anonymous, 29 June 2020 at 22:41
Hi Rob! I just have a question about chapter 15. I am using your spreadsheets (https://www.systematicmoney.org/systematic-trading-resources) but I have one big question. Maybe it is something silly, but I can't figure it out.

When you talk about "Each point is worth (c)" in the trading diary, I don't know where this comes from. In your book it just says: <**For Euro Stoxx a point move in the futures price cost 10 Euros>.
But 1% of a future worth 3370 is 33.70...
And then I don't understand why the point value (c) in the trading diary is constant between October 2014 and December 2014 in the spreadsheet while the price (b) changes.

Thank you very much

Chad B, 13 January 2021 at 14:50
Hi Rob, have you experimented with conditional probability (e.g. Bayesian) as a substitute for standard correlation measures?

I've read elsewhere that "Correlation causes many conceptual misinterpretations, especially related to causal structures."

Michael Newton, 2 March 2021 at 19:29
Above, when you are weighting the different signals (carry and lookbacks), are you accounting for both Sharpe and correlation or just correlation?
It would seem that the correlations would favor the shortest and longest lookback since they are more diversifying but in theory the middle lookback should have a higher Sharpe because that should be the hump of a curve with Sharpe falling off as you approach mean reversion as the lookback gets too short or too long.

Wengwg, 26 April 2022 at 06:09
Hi Rob, I have a question about estimating forecast weights with bootstrapping. We need an expanding window performance curve for each rule variation for each instrument, right? (With some abstract notional capital and volatility target, etc.) And first, we calculate forecasts, then scale and limit them to -20 ~ +20.
My question is, how do we get the forecast scalars? If we use all the sample data to estimate forecast scalars, then backtest on this sample, it seems like we are using something from the future, because we wouldn't actually know these "latest" forecast scalars at the beginning of the backtest.
Because we limit forecasts to -20 ~ +20, the performance curve would change if we used a different forecast scalar. I know that forecast scalars seem to change little over the years, so maybe it's not a big problem?
Apologies if I misunderstood. Thank you!
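
The look-ahead concern raised here can be made concrete with a small sketch comparing a full-sample forecast scalar with one estimated on an expanding window. It assumes the book's convention of targeting an average absolute forecast of 10; the function name, the `min_periods` default and the synthetic raw forecast are hypothetical, and this is not the pysystemtrade implementation.

```python
import numpy as np

def expanding_forecast_scalar(raw_forecast, target_abs=10.0, min_periods=20):
    """Scalar at time t uses only raw forecasts observed up to and including t."""
    abs_forecast = np.abs(np.asarray(raw_forecast, dtype=float))
    counts = np.arange(1, len(abs_forecast) + 1)
    # Expanding mean of the absolute raw forecast.
    expanding_mean_abs = np.cumsum(abs_forecast) / counts
    scalar = np.full(len(abs_forecast), np.nan)
    # Leave the scalar undefined until enough history has accumulated.
    valid = (counts >= min_periods) & (expanding_mean_abs > 0)
    scalar[valid] = target_abs / expanding_mean_abs[valid]
    return scalar

# Synthetic raw forecast (hypothetical data).
rng = np.random.default_rng(1)
raw = rng.normal(0.0, 0.5, size=1000)

scalar = expanding_forecast_scalar(raw)
scaled = np.clip(raw * scalar, -20.0, 20.0)      # capped forecast, no look-ahead
full_sample_scalar = 10.0 / np.abs(raw).mean()   # uses the whole history at once
```

By construction the expanding scalar converges to the full-sample value at the end of the data; the two differ most early on, when little history is available.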

Sam Fisher, 13 July 2022 at 13:50
Hi Rob, I really appreciate that you're still replying to a blog written six years ago. So if you can see my comment, here is one question:

Am I understanding correctly that in the combined forecast stage you are using two different correlations?:
1) For the forecast weights calculation, be it bootstrapping or shrinkage, you use the correlation matrix calculated from the PERFORMANCE curves; and
2) For the fdm calculation, you use the correlation matrix calculated from the FORECAST values.
(the capitalization is supposed to be highlighting, instead of yelling, sorry about that)

Thank you very much!

Umut, 2 November 2023 at 22:45
Hi Rob,

I'm not sure if you are still monitoring this space but regardless, I really appreciate your sharing your wisdom with us here. I've been reading your book, Systematic Trading, for the last couple of weeks with great admiration for your work.

I'm struggling with calculating the forecast weights with bootstrapping. I've managed to generate scaled forecasts for my trading rules. Now I'm at the stage of combining forecasts.

My questions are:
i) Do I use the p&l for each trading rule and try to optimise the profit in order to calculate forecast weights?
ii) If I use bootstrapping with pooling, for each run, do I calculate a different set of weights per instrument that I'm pooling, average these weights on each run, and then take a final average at the end to calculate the final weights?

Your help would be much appreciated.
Many thanks.
