Learning the dynamics of technical trading strategies


We use an adversarial expert-based online learning algorithm to learn the optimal parameters required to maximise wealth when trading zero-cost portfolio strategies. The learning algorithm is used to determine the relative population dynamics of technical trading strategies that can survive historical back-testing, as well as to form an overall aggregated portfolio trading strategy from the set of underlying trading strategies implemented on daily and intraday Johannesburg Stock Exchange data. The resulting population time-series are investigated using unsupervised learning for dimensionality reduction and visualisation. A key contribution is that the overall aggregated trading strategies are tested for statistical arbitrage using a novel hypothesis test proposed by Jarrow et al. (2012) on both daily sampled and intraday time-scales. The (low frequency) daily sampled strategies fail the arbitrage tests after costs, while the (high frequency) intraday sampled strategies are not falsified as statistical arbitrages after costs. The estimates of trading strategy success, cost of trading and slippage are considered along with an offline benchmark portfolio algorithm for performance comparison. In addition, the algorithm's generalisation error is analysed by recovering a probability of back-test overfitting estimate using a nonparametric procedure introduced by Bailey et al. (2016). The work aims to explore and better understand the interplay between different technical trading strategies from a data-informed perspective.


Online learning; technical trading; portfolio selection; statistical arbitrage; back-test overfitting; Johannesburg Stock Exchange


G11, G14 and O55

1 Introduction

Maximising wealth concurrently over multiple time periods is a difficult task, particularly when combined with capital allocations that are made simultaneously with the selection of plausible candidate trading strategies and signals. An approach to combining strategy selection with wealth maximisation is to use online or sequential machine learning algorithms (Györfi et al. (2012)). Online portfolio selection algorithms attempt to automate a sequence of trading decisions among a set of stocks with the goal of maximising returns in the long run. Here the long run can correspond to months or even years, and is dependent on the frequency at which trading takes place.1 Such algorithms typically use historical market data to determine, at the beginning of a trading period, a way to distribute their current wealth among a set of stocks. These types of algorithms can use many more features than merely prices, so-called “side-information”, but the principle remains the same.

The attraction of this approach is that the investor does not need to have any knowledge about the underlying distributions that could be generating the stock prices (or even if they exist). The investor is left to “learn” the optimal portfolio to achieve maximum wealth using past data directly (Györfi et al. (2012)).

Cover (1991) introduced a “follow-the-winner” online investment algorithm2 called the Universal Portfolio (UP) algorithm3. The basic idea of the UP algorithm is to allocate capital to a set of experts characterised by different portfolios or trading strategies, and then to let them run while, at each iterative step, shifting capital from losers to winners to find a final aggregate wealth.

Here our “experts” will be similarly characterised by a portfolio, but by portfolios that proxy different trading strategies. A particular expert makes decisions independently of all other experts. The UP algorithm holds parametrised Constant Rebalanced Portfolio (CRP) strategies as its underlying experts. We will take a more general approach to generating experts. The UP algorithm provides a method to effectively distribute wealth among all the CRP experts such that the average log-performance of the strategy approaches that of the Best Constant Rebalanced Portfolio (BCRP), which is the strategy chosen in hindsight that gives the maximum return of all such strategies in the long run. A key innovation was the provision of a mathematical proof for this claim based on arbitrary sequences of ergodic and stationary stock return vectors (Cover (1991)).
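To make the CRP idea concrete, the following is a minimal sketch (not the authors' implementation) of the cumulative wealth of a long-only constant rebalanced portfolio. The function name `crp_wealth` and the two-asset toy market are hypothetical and chosen purely for illustration.

```python
def crp_wealth(price_relatives, weights):
    """Cumulative wealth of a constant rebalanced portfolio (CRP).

    price_relatives[t][i] is the price relative P[t][i] / P[t-1][i] of asset i.
    weights are fixed long-only weights summing to one; the portfolio is
    rebalanced back to these weights at the start of every period.
    """
    wealth = 1.0
    for x_t in price_relatives:
        # Period growth factor is the weighted average of the price relatives.
        wealth *= sum(w * x for w, x in zip(weights, x_t))
    return wealth


# Hypothetical toy market: cash (price relative 1) and a stock that
# alternately doubles and halves in value.
toy_market = [[1.0, 2.0], [1.0, 0.5]] * 2
stock_only = crp_wealth(toy_market, [0.0, 1.0])
half_half = crp_wealth(toy_market, [0.5, 0.5])
```

In this toy market the stock-only portfolio ends where it started (wealth 1.0), while the 50/50 rebalanced portfolio grows to 1.265625. The BCRP is the weight vector that maximises this quantity in hindsight.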

If some log-optimal portfolio exists such that no other investment strategy has a greater asymptotic average growth, then to achieve such optimality one must have full knowledge of the underlying distribution and of the generating process (Algoet and Cover (1988); Cover (1991); Cover and Ordentlich (1996); Györfi et al. (2008)). Such knowledge is unlikely in the context of financial markets. However, strategies which achieve an average growth rate that asymptotically approximates that of the log-optimal strategy are possible when the underlying asset return process is sufficiently close to being stationary and ergodic. Such a strategy is called universally consistent.

Györfi et al. (2008) proposed a universally consistent portfolio strategy and provided empirical evidence of a strategy based on nearest-neighbour based experts which reflects such asymptotic log-optimality. The idea is to match current price dynamics with similar historical dynamics using pattern matching. The pattern matching is implemented using a nearest-neighbour search algorithm to select parameters for experts. The pattern-matching algorithm was extended by Loonat and Gebbie (2018) in order to implement a zero-cost portfolio i.e. a long/short and self-financing portfolio. The algorithm was also re-cast to replicate near-real-time applications using look-up libraries learnt offline. However, there is a computational cost associated with coupling and creating offline pattern libraries and the algorithms are not truly online.

A key objective in the implementation of online learning here is that the underlying experts remain online and they can be sequentially computed on a moving finite data-window using parameters from the previous time-step. Here we have ignored the pattern-matching step in the aforementioned algorithms and rather propose our own expert generating algorithm using tools from technical analysis. Concretely, we replace the pattern-matching expert generating algorithm with a selection of technical trading strategies.

Technical analysis indicators are popular tools used to generate trading strategies (Chan (2009); Clayburg (2002)). They claim to be able to exploit statistically measurable short-term market opportunities in stock prices and volume by studying recurring patterns in historical market data (Creamer and Freund (2010); Chande and Kroll (1994); Rechenthin (2014)). What differentiates technical analysis from traditional time-series analysis is that it tends to place an emphasis on recurring time-series patterns rather than invariant statistical properties of time-series. Traditionally, technical analysis has been a visual activity, whereby traders study the patterns and trends in charts, based on price or volume data, and use these diagnostic tools in conjunction with a variety of qualitative market features and news flow to make trading decisions.

In the setting of financial markets, this is perhaps not dissimilar to the relationship between alchemy and chemistry, or astrology and astronomy, and many studies have criticised the lack of a solid mathematical foundation for many of the proposed technical analysis indicators (Aronson (2007); Lo et al. (2000); Lo and Hasanhodzic (2009)). There has also been an abundance of academic literature utilising technical analysis for the purpose of trading, and several studies have attempted to develop indicators and test them in a more mathematically, statistically and numerically sound manner (Aronson (2007); Kestner (2003); Park and Irwin (2004)). Much of this work is still viewed with some suspicion, given that it is extremely unlikely that this or that particular strategy or approach was not the result of some sort of back-test overfitting (Hou et al. (2017); Bailey et al. (2014, 2016)).

Our work does not address the question: “Which, if any, technical analysis methods reveal useful information for trading purposes?”; rather, we aim to bag a collection of technical experts and allow them to compete in an adversarial manner using the online learning algorithm. This allows us to consider whether the resulting aggregate strategy can: 1.) pass reasonable tests for statistical arbitrage, and 2.) have a relatively low probability of being the result of back-test overfitting. Can the strategy be considered a statistical arbitrage, and can it generalise well out-of-sample?

Concretely, here we are concerned with understanding whether the collective population of technical experts can, through time, lead to dynamics that can reasonably be considered a statistical arbitrage (Jarrow et al. (2012)), and then with a reasonably low probability of back-test overfitting (Bailey et al. (2016)). Can we generate wealth, both before costs, and then after costs, using the online aggregation of technical strategies? Then, what broad groups of strategies will emerge as being successful in the sense of positive trading profits with declining variance in losses?

Incorrectly accounting for costs will always be a plausible explanation for any apparently profitable trading strategy (see Loonat and Gebbie (2018)), but even after costs there still exists a high likelihood that there was some data-overfitting, because we only have a single price path from history with little or no knowledge about the true probability of the particular path that has been measured. At the same time, the adaptive nature of markets themselves is continually changing the efficacy of various strategies and approaches.

Rather than considering various debates relating to the technicalities of market efficiency, where one is concerned with the expeditiousness of market prices to incorporate new information at any time, and where information is explicitly exogenous, we restrict ourselves to market efficiency in the sense used by Fischer Black (Black (1986); Bouchaud et al. (2017); Aronson (2007)). This is the situation where some of the short-term information is in fact noise, and where this type of noise is a fundamental property of real markets. Although market efficiency may plausibly hold over the longer term, in the short-term there may be small departures that are amenable to tests for statistical arbitrage (Jarrow et al. (2012)), departures that create incentives to trade, and more importantly, departures that may not be easily traded out of the market due to various asymmetries in costs, market structure and market access.

In order to analyse whether the overall back-tested strategy depicts a candidate statistical arbitrage, we implement a test first proposed by Hogan et al. (2004) and further refined by Jarrow et al. (2012). Hogan et al. (2004) provide a plausible technical definition of statistical arbitrage based on a vanishing probability of loss and variance in the trading profits, and then use this to propose a test for statistical arbitrage using a Bonferroni test (Hogan et al. (2004)). This methodology was extended and generalised by Jarrow et al. (2012) to account for the asymmetry between desirable positive deviations (profits) and undesirable negative deviations (losses), by including a semi-variance hypothesis instead of the originally constructed variance hypothesis, which does not condition on negative incremental deviations. The so-called Min-$t$ statistic is computed, and used in conjunction with a Monte Carlo procedure, to make inferences regarding a carefully defined “no statistical arbitrage” null hypothesis.

This is analogous to evaluating market efficiency in the sense of the Noisy efficient market hypothesis (Black (1986)) whereby a failure to reject the no statistical arbitrage null hypothesis will result in concluding that the market is in fact sufficiently efficient and no persistent anomalies can be consistently exploited by trading strategies over the long term. Traders will always be inclined to employ strategies which depict a statistical arbitrage and especially strategies which have a probability of loss that declines to zero quickly as such traders will often have limited capital and short horizons over which they must provide satisfactory returns (profits) (Jarrow et al. (2012)).

We make the effort here to be very clear that we do not attempt to identify profitable (technical) trading strategies nor to make any claims about the informational value of technical analysis, but rather we will generate a large population of strategies, or experts, constructed from various technical trading rules, and combinations of the associated parameters of these rules, in the attempt to learn something about the aggregate profitability of the population dynamics of the set of experts.

Experts will generate trading signals, i.e. buy, sell or hold decisions, for each stock held in their portfolio, based on the underlying parameters and the necessary historic data implied by those parameters. Once trading signals for the current time period have been generated by a given expert, a methodology to transform the signals into a set of portfolio weights, or controls, is required.

We introduce a transformation method that computes controls proportional to the relative volatilities of the stocks for which non-zero trading signals were generated, and then normalise the resulting values such that the self-financing and leverage constraints required by the algorithm are satisfied. The resulting controls are then utilised to compute the corresponding expert wealths. The experts who accumulate the greatest wealth during a trading period will receive more wealth in the following trading period and thus contribute more to the final aggregated portfolio. This can be best thought of as some sort of “fund-of-funds” over the underlying collection of trading strategies.

This is a meta-expert that aggregates experts that represent all the individual technical trading rules. The overall meta-expert strategy performance is achieved by the online learning algorithm. We explicitly provide equity curves for the individual experts' portfolios, given as the accumulated trading profit through time, along with performance curves for the overall strategy's wealth and the associated profits and losses.

We perform a back-test of the algorithm on two different data sets over two separate time periods: 1.) one using daily data over a six-year period, and 2.) the other using a mixture of intraday and daily data over a two-month period. A selection of the fifteen most liquid stocks which constitute the Johannesburg Stock Exchange (JSE) Top 40 shares is utilised for the two separate implementations4.

The overall strategy performance is compared to the BCRP strategy to form a benchmark comparison to evaluate the success of our strategy. The overall strategy is then tested for statistical arbitrage to find that in both a daily, and intraday-daily data implementation, the strategy depicts a statistical arbitrage before costs.

A key point here, as in Loonat and Gebbie (2018), is that it does seem that plausible statistical arbitrages are detected on the meso-scale and in the short-term. However, after accounting for reasonable costs, only the short-term trading strategies seem to pass statistical arbitrage tests. This does not imply profitability, as these remaining strategies may be structural and hence may not be easily profitably traded out of the system.

Finally, we analyse the generalisation error of the overall strategy to get a sense of whether or not the strategy conveys back-test overfitting, by estimating the probability of back-test overfitting (PBO) inherent in multiple simulations of the algorithm on subsets of historic data.

The paper will proceed as follows: section 2 explains the construction of the algorithm, including details of how the experts are generated, how their corresponding trading signals are transformed into portfolio weights, and a step-by-step break-down of the learning algorithm. In section 3, we introduce the concept of a statistical arbitrage, including the methodology for implementing a statistical arbitrage test, calculating the probability of loss and estimating the PBO for a trading strategy. All experiment results and analyses of implementations of the algorithm are presented in section 5. Section 6 states all final conclusions from the experiments and possible future work.

In summary, we are able to show that on a daily sampled time-scale there is most likely little value in the aggregate trading of the technical strategies of the type considered here. However, on intraday time-scales things look slightly different. Even with reasonable costs accounted for, price-based technical trading cannot be ruled out as offering an avenue for statistical arbitrage. Considerable care is still required to ensure that one is accounting for the full complexity of market realities, which can often make it practically difficult to arbitrage these sorts of apparent trading opportunities out of the market: they may be the result of top-down structure and order-flow itself, and the signature of noise trading, rather than of some notion of information inefficiency.

2 Learning Technical Trading

Rather than using a back-test in the attempt to find the single most profitable strategy, we produce a large population of trading strategies (“experts”) and use an adaptive algorithm to aggregate the performances of the experts to arrive at a final portfolio to be traded. The idea of the online learning algorithm is to consider a population of experts created using a large set of technical trading strategies generated from a variety of parameters and to form an aggregated portfolio of stocks to be traded by considering the wealth performance of the expert population.

During each trading period, experts trade and execute buy (1), sell (-1) or hold (0) actions. These actions are independent of one another and based on each individual expert's strategy. The trade signals, $\{-1, 0, 1\}$, are transformed into a set of portfolio weights such that their sum is identical to zero; this ensures that the strategy can be self-funding. We also require that the portfolio is unit leveraged, and hence the sum of the absolute values of the controls is equal to one. This is to avoid having to introduce a margin account into the trading mechanics.5

Based on each individual expert’s accumulated wealth up until some time $t$, a final aggregate portfolio for the next period $t+1$ is formed by creating a performance-weighted combination of the experts. Experts who perform better in period $t$ will have a larger relative contribution toward the aggregated portfolio to be implemented in period $t+1$ than those who perform poorly. Below, we describe the methodology for generating the expert population.

2.1 Expert Generating Algorithm

Technical trading

Technical trading refers to the practice of using trading rules derived from technical analysis indicators to generate trading signals. Here, indicators refer to mathematical formulas based on Open-High-Low-Close (OHLC) price bars, volume traded or a combination of both (OHLCV). An abundance of technical indicators and associated trading rules have been developed over the years with mixed success. Indicators perform differently under different market conditions and different human operators, which is why traders will often use multiple indicators, confirming the signal that one indicator gives on a stock with another indicator's signal. Thus, in practice and in various studies in the literature, many trading rules generated from indicators are typically back-tested on a significant amount (typically thousands of data points) of historical data to find the rules that perform the best6. It is for this reason that we consider a diverse set of technical trading rules. In addition to the set of technical trading strategies, we implement three other popular portfolio selection algorithms, each of which has been adapted to generate zero-cost portfolio controls. These three rules generically conform to a combination of chartist trend followers, fundamentalist contrarians, and short-term correlation traders. An explanation of these three algorithms is provided in appendix B, while each of the technical strategies is described in appendix A.

In order to produce the broad population of experts, we consider combinations among a set of four model parameters. The first of these parameters is the underlying strategy of a given expert, $\omega$, which corresponds to the set of technical trading and trend-following strategies, where the total number of different trading rules is denoted by $W$. Each of the rules requires at most two parameters each time a buy, sell or hold signal is computed at some time period $t$. The two parameters represent the number of short and long-term look-back periods necessary for the indicators used in the rules. These parameters determine the amount of historic data considered in the computation of each rule. We denote the vector of short-term parameters by $n_1$ and the vector of long-term parameters by $n_2$, which make up two of the four model parameters. Let $L = |n_1|$ and $K = |n_2|$ be the number of short-term and long-term look-back parameters respectively.7 Also, we denote the number of trading rules which utilise one parameter by $W_1$ and the number of trading rules utilising two parameters by $W_2$, and hence $W = W_1 + W_2$.

The final model parameter, denoted by $c$, refers to object clusters, where $c(j)$ is the $j$-th object cluster and $C$ is the number of object clusters. We will consider four object clusters ($C = 4$): the trivial cluster, which contains all the stocks, and the three major sector clusters of stocks on the JSE, namely Resources, Industrials and Financials.8

The algorithm will loop over all combinations of these four model parameters, calling the appropriate strategies, $\omega$, stocks, $c$, and amount of historic data, $n_1$ and $n_2$, to create a buy, sell or hold signal at each time period $t$. Each combination of $\omega(w)$ for $w = 1, \ldots, W$, $c(j)$ for $j = 1, \ldots, C$, $n_1(\ell)$ for $\ell = 1, \ldots, L$ and $n_2(k)$ for $k = 1, \ldots, K$ will represent an expert. It should be clear that some experts may trade all the stocks, i.e. use the trivial cluster, and others will trade subsets of the stocks, i.e. resources, industrials and financials. It is also important to note that for rules requiring two parameters, the loop over the long-term parameters will only activate at indices $k$ for which $n_1(\ell) < n_2(k)$, where $\ell$ and $k$ represent the loop index over the short and long-term parameters respectively. The total number of experts, $\Omega$, is then the number of one-parameter experts plus the number of two-parameter experts generated by this loop.
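The enumeration of experts described above can be sketched as follows. This is an illustrative sketch only: it assumes that one-parameter rules loop over the short-term look-back vector (the text does not state which vector they use), and the rule names, cluster labels and look-back values in the usage example are hypothetical.

```python
from itertools import product


def enumerate_experts(one_param_rules, two_param_rules, clusters, n1, n2):
    """Return one tuple (rule, cluster, short, long) per expert.

    Assumption: one-parameter rules use only the short-term look-backs n1;
    two-parameter rules use every pair with short < long.
    """
    experts = []
    for rule, cluster, short in product(one_param_rules, clusters, n1):
        experts.append((rule, cluster, short, None))
    for rule, cluster, short, long_ in product(two_param_rules, clusters, n1, n2):
        if short < long_:  # long-term loop only activates when n1 < n2
            experts.append((rule, cluster, short, long_))
    return experts


# Hypothetical inputs, for illustration only.
experts = enumerate_experts(
    one_param_rules=["momentum"],
    two_param_rules=["ma_crossover"],
    clusters=["all", "resources", "industrials", "financials"],
    n1=[4, 8, 16],
    n2=[8, 16, 32],
)
```

With these hypothetical inputs there are 12 one-parameter experts and 24 two-parameter experts, so `len(experts)` is 36.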

We will denote each expert’s strategy9 by $h_t^n$, which is an $(m+1) \times 1$ vector representing the portfolio weights of the $n$-th expert for all $m$ stocks and the risk-free asset at time $t$. Here, $m$ refers to the chosen number of stocks to be passed into the expert generating algorithm. As mentioned above, from the set of $m$ stocks, each expert will not necessarily trade all $m$ stocks (unless the expert trades the trivial cluster), since of those $m$ stocks, only a handful will fall into a given sector constituency. This implies that even though we specify each expert’s strategy ($h_t^n$) to be an $(m+1) \times 1$ vector, we will just set the controls to zero for the stocks which the expert does not trade in their portfolio. Denote by $H_t = [h_t^1, \ldots, h_t^{\Omega}]$ the expert control matrix made up of all $\Omega$ experts’ strategies at time $t$ for all $m$ stocks. In order to choose the $m$ stocks to be traded, we take the $m$ most liquid stocks over a specified number of days, denoted by $\delta_{liq}$. We make the choice of using average daily volume (ADV) as a proxy for liquidity.10 ADV is simply the average volume traded for a given stock over a period of time. The ADV for stock $i$ over the past $\delta_{liq}$ periods is

$$\mathrm{ADV}_i = \frac{1}{\delta_{liq}} \sum_{t=1}^{\delta_{liq}} V_{i,t},$$

where $V_{i,t}$ is the volume of the $i$-th stock at period $t$. The $m$ stocks with the largest ADV will then be fed into the algorithm for trading.
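As a minimal sketch of the liquidity filter, assuming volumes are held as one list per stock with the most recent observation last (the function names and ticker labels below are hypothetical):

```python
def average_daily_volume(volumes, lookback):
    """Mean traded volume over the most recent `lookback` periods."""
    recent = volumes[-lookback:]
    return sum(recent) / len(recent)


def most_liquid(volume_by_stock, lookback, m):
    """Return the m stock labels with the largest ADV, most liquid first."""
    adv = {
        stock: average_daily_volume(volumes, lookback)
        for stock, volumes in volume_by_stock.items()
    }
    return sorted(adv, key=adv.get, reverse=True)[:m]


# Hypothetical volume histories, for illustration only.
volume_by_stock = {
    "AAA": [100, 200, 300],
    "BBB": [50, 50, 50],
    "CCC": [400, 10, 10],
}
selected = most_liquid(volume_by_stock, lookback=2, m=2)
```

Here `selected` is `["AAA", "BBB"]`: over the last two periods the ADVs are 250, 50 and 10 respectively, so the early volume spike in `CCC` falls outside the look-back window.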

Transforming signals into weights

In this section, we describe how each individual expert’s set of trading signals at each time period are transformed into a corresponding set of portfolio weights (controls) which constitute the expert’s strategy ($h_t^n$). For the purpose of generality, we refer to the number of stocks traded by a given expert as $m$, even though the weights of many of these stocks will be zero for multiple periods, as the expert will only be considering a subset of these stocks depending on which object cluster the expert trades.

Suppose it is currently time period $t$ and the $n$-th expert is trading $m$ stocks. Given that there are $m$ stocks in the portfolio, $m$ trading signals will need to be produced at each trading period. The risk-free asset's purpose will be solely to balance the portfolio given the set of trading signals. Given the signals for the current time period $t$ and previous period $t-1$, all hold signals at time $t$ are replaced with the corresponding non-zero signals from time $t-1$, as the expert retains his position in these stocks11. All non-hold signals at time $t$ are of course not replaced by the previous period's signals, as the expert has taken a completely new position in the stock. This implies that when the position in a given stock was short at period $t-1$, for example, and the current period's ($t$) signal is long, then the expert takes a long position in the stock rather than neutralising the previous position. Before computing the portfolio controls, we compute a combined signal vector made up of signals from time period $t-1$ and time $t$ using the idea discussed above. We will refer to this combined set of signals as output signals. We then consider four possible cases of the output signals at time $t$ for a given expert:

  1. All output signals are hold (0)

  2. All output signals are non-negative (0 or 1)

  3. All output signals are non-positive (0 or -1)

  4. There are combinations of buy, sell and hold signals (0, 1 and -1) in the set of output signals
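The construction of output signals and the four cases listed above can be sketched as follows. This is a hedged illustration: whether the previous-period vector is the raw signal or the previously combined output signal is not specified in the text, so the sketch simply takes whichever vector the caller supplies. The function names are hypothetical.

```python
def output_signals(previous, current):
    """Carry the previous signal forward wherever the current signal is hold (0)."""
    return [p if c == 0 else c for p, c in zip(previous, current)]


def classify_case(signals):
    """Map a vector of output signals in {-1, 0, 1} to cases I-IV."""
    has_long = any(s > 0 for s in signals)
    has_short = any(s < 0 for s in signals)
    if not has_long and not has_short:
        return "I"    # all hold
    if has_long and not has_short:
        return "II"   # long-only
    if has_short and not has_long:
        return "III"  # short-only
    return "IV"       # mixture of long and short


combined = output_signals(previous=[1, -1, 0, 0], current=[0, 1, 0, -1])
case = classify_case(combined)
```

In this example the first stock keeps its long position, the second flips from short to long, the third stays flat and the fourth opens a short, giving `combined == [1, 1, 0, -1]` and case IV.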

Because cases II (long-only) and III (short-only) exist, we need to include a risk-free asset in the portfolio so that we can enforce the self-financing constraint: the controls must sum to zero, $\sum_i w_i = 0$, where $w_i$ denotes the weight of the $i$-th asset (the stocks and the risk-free asset) in the expert's portfolio. We refer to such portfolios as zero-cost portfolios. Additionally, we implement a leverage constraint by ensuring that the absolute values of the controls sum to unity: $\sum_i |w_i| = 1$.

For case I, we set all stock weights and the risk-free asset weight to zero so that the expert does not allocate any capital in this case since the output signals are all zero.

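For the remaining cases we need to find asset weights that satisfy the investment constraints. From the asset class standard deviations we can define the buy (and sell) weight allocation to stocks with positive (negative) signals, giving the long-asset (short-asset) weights:

$$w_i^{\pm} = \pm \frac{1}{2} \, \frac{\sigma_{\pm}(i)}{\sum_{j} \sigma_{\pm}(j)} \qquad (2)$$

Here $\sigma_{+}$ ($\sigma_{-}$) is the vector of standard deviations of the stocks with positive (negative) output signals.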

We will use these weights based on the signals to generate trading positions.

For case II, we compute the standard deviations of the set of stocks which resulted in buy (positive) signals from the output signals using their closing prices over the last 90 days for daily trading and the last 90 trading periods for intraday-daily trading12 and use these standard deviations to allocate a weight that is proportional to the volatility (more volatile stocks receive higher weight allocations).

The number of buy signals, from the set of output signals, can be denoted by $m_{+}$ and the vector of standard deviations of stocks with non-zero output signals is given by $\sigma_{+}$. Then the weight allocated to stocks with positive signals is given by the positive-signal form of equation (2). Here the lowest value of $w_i^{+}$ corresponds to the least volatile stock and vice versa for large $w_i^{+}$. This equation ensures that $\sum_{i} w_i^{+} = \frac{1}{2}$. We then short the risk-free asset with a weight of one half ($w_{rf} = -\frac{1}{2}$). This allows us to borrow using the risk-free asset and purchase the corresponding stocks in which we take a long position.

Case III is similar to Case II above; however, instead of having positive output signals, all output signals are negative. Again, we compute standard deviations of the set of stocks which resulted in sell (negative) signals from the output signals using their closing prices over the last 120 trading periods. Let the number of sell signals from the set of output signals be denoted by $m_{-}$ and denote the vector of standard deviations of stocks to be sold by $\sigma_{-}$. Then the weight allocated to stocks which have short positions is given by the negative-signal form of equation (2). We then take a long position in the risk-free asset with a weight of one half ($w_{rf} = \frac{1}{2}$).

For case IV, we use a similar methodology to that discussed above in Cases II and III. To compute the weights for the short assets we use the negative-signal formula in equation (2); similarly, for the long assets we use the positive-signal formula from equation (2). We then set the risk-free asset weight equal to $w_{rf} = -\sum_{i} w_i$, the negative of the sum of the stock weights, in order to enforce the self-financing and fully invested constraints. Finally, assets which had hold signals have their weights set to zero.
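The four cases can be handled by a single routine, sketched below under stated assumptions: long (short) stock weights are proportional to volatility and sum to plus (minus) one half, and the risk-free weight is whatever balances the stock weights to zero. The function name is hypothetical, and the volatilities are taken as given positive inputs rather than computed from closing prices.

```python
def volatility_loading(signals, vols):
    """Transform output signals into zero-cost, unit-leverage controls.

    signals: output signals in {-1, 0, 1}, one per stock.
    vols: strictly positive standard deviations, one per stock (assumption).
    Returns (stock_weights, risk_free_weight).
    """
    long_total = sum(v for s, v in zip(signals, vols) if s > 0)
    short_total = sum(v for s, v in zip(signals, vols) if s < 0)
    weights = []
    for s, v in zip(signals, vols):
        if s > 0:
            weights.append(0.5 * v / long_total)    # long side sums to +1/2
        elif s < 0:
            weights.append(-0.5 * v / short_total)  # short side sums to -1/2
        else:
            weights.append(0.0)                     # hold: no allocation
    # Self-financing: the risk-free asset offsets the net stock position.
    risk_free = 0.0 - sum(weights)
    return weights, risk_free


long_only = volatility_loading([1, 1, 0], [0.1, 0.3, 0.2])
mixed = volatility_loading([1, -1, -1], [0.2, 0.1, 0.3])
```

For the long-only example the stock weights are approximately 0.125, 0.375 and 0 with a risk-free weight of -0.5; for the mixed example the risk-free weight is zero. In both cases the weights sum to zero and their absolute values sum to one, and an all-hold signal vector yields all-zero weights as in case I.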

The method described above is what we will refer to as the volatility loading method for transforming signals into controls. A second method is considered, called the inverse volatility loading method, and is defined similarly to the method described above, however, instead of multiplying through by the volatility vector in each of the above cases, we multiply through by the inverse of the volatility vector (element-wise inverses). We will not implement the inverse volatility loading method in this study as the results of the two methods are similar.

2.2 Online Learning Algorithm

Given that we now have a population of experts, each with their own controls, $h_t^n$, we implement the online learning algorithm to aggregate the experts' strategies at time $t$ based on their performance and form a final single portfolio to be used in the following period $t+1$, which we denote $b_t$. The aggregation scheme used is inspired by the Universal Portfolio (UP) strategy taken from the work done by Cover (1991); Cover and Ordentlich (1996) and a modified version proposed by Györfi et al. (2008). However, because we have several different base experts as defined by the different trading strategies, rather than Cover’s (see Cover (1991)) constant rebalanced UP strategy, our algorithm is better defined as a meta-learning algorithm (Li and Hoi (2016)). We use the subscript $t$ since the portfolio is created using information only available at time $t$ even though the portfolio is implemented in the following time period. The algorithm will run from the initial time $t_{min}$ (taken to be 2)13 until terminal time $T$. $t_{min} = 2$ is required to ensure there is sufficient data to compute a return for the first active trading day. We must point out here that experts will only actively begin making trading decisions once there is sufficient data to satisfy their look-back parameter(s) and subsequently, since the shortest look-back parameter is 4 periods, the first trading decisions will only be made during day 5. The idea is to take in the $m$ stocks' OHLCV values at each time period, which we will denote by $X_t$. We then compute the price relatives at each time period, given by $x_t = (x_{1,t}, \ldots, x_{m,t})$ where $x_{i,t} = P^c_{i,t} / P^c_{i,t-1}$ and where $P^c_{i,t}$ is the closing price of stock $i$ at time period $t$. Expert controls are generated from the price relatives for the current period $t$ to form the expert control matrix $H_t$. From the corresponding expert control matrix, the algorithm will then compute the expert performance $Sh_t$, which is the associated wealth of all $\Omega$ experts at time $t$. Denote the $n$-th expert’s wealth at time $t$ by $Sh_t^n$. We then form the final aggregated portfolio, denoted by $b_t$, by aggregating the experts' wealth using the agent mixture update rules.
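The price relatives used as input to the learning algorithm can be computed from closing prices as in the sketch below. The function name and the layout of `closes` (one row per period, one column per stock) are assumptions made for illustration.

```python
def price_relatives(closes):
    """Return x[t][i] = closes[t][i] / closes[t-1][i] for t >= 1.

    closes[t][i] is the closing price of stock i at period t, so the
    result has one row fewer than the input; this is why at least two
    periods of data are needed before the first return can be computed.
    """
    return [
        [p_now / p_prev for p_now, p_prev in zip(closes[t], closes[t - 1])]
        for t in range(1, len(closes))
    ]


# Hypothetical closing prices for two stocks over three periods.
x = price_relatives([[100.0, 50.0], [110.0, 45.0], [99.0, 45.0]])
```

The first row of `x` is approximately `[1.1, 0.9]` and the second approximately `[0.9, 1.0]`.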

The relatively simple learning algorithm is implemented incrementally online, but offline it can be parallelised across experts. Given the expert controls from the Expert Generating Algorithm ($H_t$), the online learning algorithm is implemented by carrying out the following steps (Loonat and Gebbie (2018)):

  1. Update portfolio wealth: Given the portfolio control for each asset at time $t-1$, we update the portfolio wealth for period $t$.

    The portfolio wealth is the compounded cumulative wealth of the overall aggregate portfolio, and the aggregate portfolio wealths over time are collected in a corresponding vector. Here the realised price relatives of each asset for period $t$ are combined with the portfolio controls from the previous period $t-1$ to obtain the realised portfolio return for the current period $t$. This return is in fact the profit or loss for the current trading period $t$, so we use it to update the algorithm's overall cumulative profits and losses.

  2. Update expert wealth: The expert controls were determined at the end of time period $t-1$ for time period $t$ by the expert generating algorithm, for each expert and for each of the objects about which the experts make capital allocation decisions. At the end of time period $t$, the performance of each expert can be computed from the change in the price relatives of each of the objects in the investment universe, using the closing prices at the start and the end of the time increment together with the expert controls.

  3. Update expert mixtures: We consider a UP-inspired expert mixture update rule as follows (Cover (1991); Györfi et al. (2008); Loonat and Gebbie (2018)): the mixture of each expert for the next time increment, $t+1$, is set equal to the expert's accumulated wealth up until time $t$, and is used as the update feature for the next unrealised increment after appropriate normalisation.

  4. Renormalise expert mixtures: As mentioned previously, we consider experts such that the leverage is set to unity for zero-cost portfolios: 1.) the weights sum to zero and 2.) the absolute values of the weights sum to one. We do not consider the long-only experts (absolute experts as in Loonat and Gebbie (2018)), but only experts who satisfy these two conditions, which we refer to as active experts. This in fact allows for the shorting of one expert against another; then, due to the nature of the mixture controls, the resulting portfolio becomes self-funding.

  5. Update portfolio controls: The portfolio controls are updated at the end of time period $t$ for time period $t+1$ using the expert mixture controls from the updated learning algorithm and the vector of expert controls for each expert from the expert generating algorithm, using information from time period $t$. We then take a weighted average over all experts by summing over the experts.


The strategy is to implement the portfolio controls, wait until the end of the current time increment, measure the features (OHLCV values), update the experts and then re-apply the learning algorithm to compute the expert mixtures and portfolio controls for the next time increment.
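The five steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the argument layout and the exact form of the wealth updates (compounding by one plus the control-weighted excess price relatives) are assumptions made for the example, as is the choice to take no position when all experts are tied.

```python
def normalise_active(weights):
    """Centre weights so they sum to zero and scale so absolute values sum to one."""
    mean = sum(weights) / len(weights)
    centred = [w - mean for w in weights]
    leverage = sum(abs(c) for c in centred)
    if leverage == 0.0:
        return [0.0] * len(weights)  # assumption: all experts tied, take no position
    return [c / leverage for c in centred]


def learning_step(S, PL, Sh, b_prev, H_prev, H_new, x):
    """One pass through steps 1 to 5 (illustrative, assumed update forms).

    S, PL  : portfolio wealth and cumulative profit and loss so far
    Sh     : expert wealths, one per expert
    b_prev : portfolio controls chosen last period, one per asset
    H_prev : expert controls chosen last period, H_prev[n][m] = expert n, asset m
    H_new  : expert controls for the next period, same layout
    x      : realised price relatives for this period, one per asset
    """
    excess = [xm - 1.0 for xm in x]

    # 1. Update portfolio wealth and cumulative profit and loss.
    port_ret = sum(b * e for b, e in zip(b_prev, excess))
    S = S * (1.0 + port_ret)
    PL = PL + port_ret

    # 2. Update expert wealth using each expert's previous controls.
    Sh = [sh * (1.0 + sum(h * e for h, e in zip(h_n, excess)))
          for sh, h_n in zip(Sh, H_prev)]

    # 3. Expert mixtures are set to the accumulated expert wealth.
    q = list(Sh)

    # 4. Renormalise: zero-cost with unit leverage.
    q = normalise_active(q)

    # 5. Portfolio controls: mixture-weighted sum of the new expert controls.
    b_new = [sum(q_n * h_n[m] for q_n, h_n in zip(q, H_new))
             for m in range(len(x))]
    return S, PL, Sh, q, b_new
```

Because each expert's controls sum to zero, the mixture-weighted sum in step 5 also sums to zero, so the aggregate portfolio stays zero-cost in this sketch.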

2.3 Algorithm implementation for intraday-daily trading

Intraday trading poses a whole new set of issues that need to be considered. The first relates to how the actual trading occurs; the second is how to deal with spurious data. It is not as straightforward as substituting down-sampled transaction data into algorithms built and tested on uniformly sampled daily data. Uniformly sampled daily closing auction data is not equivalent to uniformly sampled intraday bar data. The end-of-day price discovery process is entirely different to that found intraday; the former is a closing auction, whereas in the latter prices are the result of continuous-time trading in a double auction. In addition to this, the first and last data points of each day lead to overnight gap effects that will produce spurious signals if not aggregated correctly over multiple days. Rather than dealing with the full complexity of a money management strategy, the main issue that we are concerned with is the overnight gap effect, which relates to the deviation between the prices at the end of day $t-1$ and the start of day $t$. We implement the learning algorithm on a combination of daily and intraday data, whereby decisions made on the daily time scale are made completely independently of those made on the intraday time scale, but the dynamics of the associated wealths generated by the two processes are aggregated. We refer to trading using a combination of daily and intraday data as intraday-daily trading.

The best way to think about it is to consider the experts as trading throughout the day, making decisions based solely on intraday data while compounding their wealth. Once a trading decision has been made at the final time bar, the expert makes one last trading decision on that day based on daily historic OHLCV data, where the look-back periods are based on past trading days and not on any time bars for that day. The daily trading decision can be thought of as representing the last time bar of the day, where we simply use different data to make the decision. The methodology for each of the intraday and daily trading mechanisms is almost exactly as explained in section 2.2 above; however, some alterations to the algorithm are necessary. As in the daily data implementation, a given expert begins making trading decisions as soon as a sufficient amount of data is available to it. Here, we begin the algorithm from day two so that there is sufficient data to compute at least one return on the daily time scale. We then loop over the intraday time bars from 9:15am to 4:30pm on each given day.

To describe intraday-daily trading, consider the expert wealth vector14 for all experts for a given time bar on a given day $t$, together with the associated expert control matrix. These are 'fused' daily and intraday quantities: for each day over the trading horizon, the fused expert control matrix contains the 88 intraday expert controls followed by the end-of-day expert controls based on daily closing OHLCV data. The final time bar in a day is at 4:30pm. The experts' wealth accumulated up until the final time bar15 on day $t$ is calculated from the corresponding column of the expert control matrix from the previous period and is computed solely from intraday data for day $t$. The overall portfolio controls for the final intraday trade on day $t$ are computed as before, along with the overall portfolio wealth. This position is held until the close of the day's trading, when the closing prices of the stocks are revealed. Once the closing prices are realised, the final intraday position is closed by making an offsetting trade at the closing prices. The resulting profit or loss is then compounded onto the overall portfolio wealth. Thus, no intraday positions are held overnight. The experts then make one final trading decision for the day, now that the closing price has been revealed, looking back on daily historic OHLCV data to make this decision. The experts' wealth is updated using the daily controls. The corresponding portfolio controls for all stocks are computed for the daily trading decision on day $t$ to be implemented on day $t+1$, the returns (price relatives) for day $t$ are computed, and the cumulative wealth is updated accordingly. The daily position is then held until the end of the following day or possibly further into the future (until new daily data portfolio allocations are made). This completes trading for day $t$.

At the beginning of day $t+1$, the expert wealth16 is set back to unity. Experts' wealth is reset to 1 at the beginning of the day, rather than compounded on the wealth from the previous day, because learning on intraday data across days is not possible: conditions in the market have completely changed. Trading begins by computing expert controls for the second time bar; however, no expert will have enough data to begin trading, since the shortest look-back parameter is 4, and hence all controls are set to zero. As the trading day proceeds, experts begin producing non-zero controls as soon as there is sufficient data to satisfy a given look-back parameter. Note that because the STeFI index17 (risk-free asset) is only posted daily, we use the same STeFI value for trading throughout the day. Finally, daily OHLCV data and intraday OHLCV data are denoted by separate symbols in order to differentiate between them.

2.4 Online Portfolio Benchmark Algorithm

To get an idea of how well our online algorithm performs, we compare its performance to that of the offline best constant rebalanced portfolio (BCRP). As mentioned previously, the BCRP is the constant rebalanced portfolio (CRP) strategy, chosen in hindsight, which gives the maximum return of all such strategies in the long run. To find the portfolio controls of such a strategy, we use a brute-force Monte Carlo approach: we generate 5000 random CRP strategies on the entire history of price relatives and choose the BCRP strategy to be the one that returns the maximal terminal portfolio wealth. Note that the CRP strategies we consider here are long-only.
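A brute-force search of this kind might look like the sketch below. The text does not say how the random CRP strategies are drawn, so sampling the weights uniformly from the simplex (via normalised exponential draws) is an assumption, as are all function names and the fixed seed.

```python
import random


def crp_terminal_wealth(weights, price_relatives):
    """Terminal wealth of a constant rebalanced portfolio starting from unit wealth."""
    wealth = 1.0
    for x in price_relatives:  # x: price relatives of all stocks for one period
        wealth *= sum(w * xm for w, xm in zip(weights, x))
    return wealth


def random_long_only_weights(n_assets, rng):
    """Draw non-negative weights summing to one (uniform on the simplex)."""
    draws = [rng.expovariate(1.0) for _ in range(n_assets)]
    total = sum(draws)
    return [d / total for d in draws]


def monte_carlo_bcrp(price_relatives, n_portfolios=5000, seed=0):
    """Return the best of n_portfolios random long-only CRPs and its terminal wealth."""
    rng = random.Random(seed)
    n_assets = len(price_relatives[0])
    best_weights, best_wealth = None, float("-inf")
    for _ in range(n_portfolios):
        weights = random_long_only_weights(n_assets, rng)
        wealth = crp_terminal_wealth(weights, price_relatives)
        if wealth > best_wealth:
            best_weights, best_wealth = weights, wealth
    return best_weights, best_wealth
```

The result is only an approximation to the true BCRP: with many assets, 5000 random draws cover the simplex sparsely, so the reported benchmark is a lower bound on the hindsight optimum.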

2.5 Transaction Costs and Market Frictions

Apart from the (direct) transaction fees (commissions) charged by exchanges for the trading of stocks, there are various other costs (indirect) that need to be considered when trading such assets. Each time a stock is bought or sold there are unavoidable costs and it is imperative that a trader takes into account these costs. The other three most important components of these transaction costs, besides commissions charged by exchanges, are the spread18, price impact and opportunity cost (“The Hidden Costs of Trading” (n.d.)).

To estimate indirect transaction costs (TC) for each period $t$, we use the square-root formula (Gatheral (2010)), whose inputs are:


  1. Volatility of the returns of a stock ($\sigma$): See section 2.5.1 below.

  2. Average daily volume of the stock (ADV): ADV is computed using the previous 90 days' trading volumes for daily trading and the previous 5 days' intraday trading volume for intraday-daily trading.

  3. Number of shares traded ($n$): The number of shares traded is taken to be 1bp of ADV for each stock per day for daily trading. For intraday-daily trading, it is assumed to be 70bps of ADV for the entire portfolio per day, which is then split evenly among all active trading periods during the day to get 70bps/85 of ADV per stock, per trading period (each day has 88 5-minute time bars, of which 85 are active trading periods).

  4. Spread: Spread is assumed to be 1bp per day (0.01% per day) for daily trading. For intraday-daily trading, we assume 20bps per day, which we then split evenly over the day to incur a cost of 0.002/85 per time bar.

  5. Number of signal changes: The number of times the trading signal changes from a buy (sell) to a sell (buy), for all stocks in the portfolio, over consecutive trading periods.19

The use of the square-root rule in practice dates back many years, and it is often used as a pre-trade transaction cost estimate (Gatheral (2010)). The first term in the square-root formula can be regarded as the term representing the slippage20 or temporary price impact, and results from our demand for liquidity (Huberman and Stanzl (2005)). This cost only affects the price at which we execute our trade and not the market price (and hence not the price of subsequent transactions). The second term in the square-root formula is the (transient) price impact, which affects not only the price of the first transaction but also the price of subsequent transactions by other traders in the market; however, the impact decays over time as a power law (Bouchaud et al. (2004)). In the following subsection, we discuss how the volatility ($\sigma$) is estimated for the square-root formula. Technically, $\sigma$, $n$ and ADV in the square-root formula should each be defined by a vector representing the volatilities, number of shares traded and ADV of each stock in the portfolio respectively; however, for simplicity we write each as a scalar, representing the value for a single portfolio stock.
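As a rough numerical illustration, the sketch below assumes the common two-term form of the square-root rule, a spread term plus $\sigma\sqrt{n/\mathrm{ADV}}$. The function name, the argument names and the example volatility of 2% are assumptions for illustration only, and the way the number of signal changes enters the cost is deliberately left out because it is not recoverable from the text here.

```python
import math


def square_root_cost(spread, sigma, n_shares, adv):
    """Assumed two-term cost estimate: spread plus square-root impact.

    spread   : spread cost for the period, as a fraction (0.0001 = 1bp)
    sigma    : volatility estimate for the stock
    n_shares : number of shares traded in the period
    adv      : average daily volume of the stock
    """
    if adv <= 0:
        raise ValueError("ADV must be positive")
    return spread + sigma * math.sqrt(n_shares / adv)


# Daily example: 1bp spread, trade size of 1bp of ADV, hypothetical 2% volatility.
daily_cost = square_root_cost(spread=0.0001, sigma=0.02,
                              n_shares=100, adv=1_000_000)
```

With these inputs the impact term is $0.02 \times \sqrt{10^{-4}} = 2$bps, so the total is about 3bps; because of the square root, trading a quarter of the size only halves the impact term.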

Each trade (say an entry) is considered a child order of the trading strategy across the period. This is not really a faithful representation but was chosen in order to optimise the historic simulation. It should be realised that each child order is not of a fixed size, as this is determined by the algorithm. However, at the end of the day we have a total number of shares traded and a number of entry and exit pairs.

In addition to the indirect costs associated with slippage and price impact, as accounted for by the square-root formula, we include direct costs such as the borrowing of trading capital, the cost of regulatory capital and the various fees associated with trading on the JSE (Loonat and Gebbie (2018)). Such costs also account for the small fees incurred in instances where short-selling has taken place. For the daily data implementation, we assume a total direct cost of 4bps per day. This assumption is made purely to approximately match the total daily transaction cost assumption made by Loonat and Gebbie (2018). For the intraday-daily implementation, a total direct cost of 70bps per day is assumed (following Loonat and Gebbie (2018)), which we then split evenly over each day's active trading periods (85 time bars, since the first expert only starts trading after the third time bar) to get a cost of 70bps/85 per period. These costs are indicative; determining actual price impact requires either real trading experiments or full-scale market simulation, both of which are intractable in the context of our approach.

For daily trading, we recover an average daily transaction cost of roughly 17.75bps which is almost double the 10bps assumed by Loonat and Gebbie (2018). Loonat and Gebbie argue that for intraday trading, it is difficult to avoid a direct and indirect cost of about 50-80bps per day, in each case, leaving a conservative estimate of total costs to be approximately 160bps per day. We realise an overall average cost per period of 2.17bps, while the average cost per day assuming we trade for 85 periods throughout each day is roughly 184bps (85*2.17) for intraday-daily trading.

2.5.1 Volatility Estimation for Transaction Costs

In this section, we discuss the methods used to estimate the volatility ($\sigma$) in the square-root formula (section 2.5) for daily and intraday data.

Daily Data Estimation: The volatility of daily prices on each day $t$ is taken to be the standard deviation of closing prices over the last 90 days. If 90 days have not yet passed, the standard deviation is taken over the number of days available so far.
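The daily estimate can be sketched as a trailing standard deviation with a window that expands until 90 days are available. Using the sample (rather than population) standard deviation and returning zero on the first day are assumptions made for this example, as the text does not specify either choice.

```python
import statistics


def rolling_volatility(closes, window=90):
    """Standard deviation of closing prices over the last `window` days.

    Until `window` days are available, all days seen so far are used.
    """
    vols = []
    for t in range(len(closes)):
        history = closes[max(0, t - window + 1): t + 1]
        # Assumption: a single observation has no dispersion, so report 0.0.
        vols.append(statistics.stdev(history) if len(history) > 1 else 0.0)
    return vols
```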

Intraday Data Estimation: The volatility for each intraday time bar on day $t$ depends on the time of day. For the first 15 time bars, the volatility is taken to be a forecast from a GARCH(1,1) model fitted on the last 60 returns of the previous day $t-1$. The reason for this choice is that the market is very volatile during the opening hour and that there are relatively few data points available for computing the volatility. The rest of the day's volatility estimates are computed using the Realised Volatility (RV) method (Andersen et al. (2001)). RV is one of the more popular methods for estimating the volatility of high-frequency returns21 computed from tick data. The measure estimates volatility by summing intraday squared returns at short intervals (e.g. 5 minutes). Andersen et al. (2001) propose this estimate for volatility at higher frequencies and derive it by showing that RV is an approximation of quadratic variation under the assumption that log returns follow a continuous time stochastic process with zero mean and no jumps. The idea is to show that the RV converges to the continuous time volatility (quadratic variation) (Poon (2008)), which we now demonstrate.

Assume that the instantaneous returns of the observed log stock prices ($p_t$), with unobserved latent volatility ($\sigma_t$) scaled continuously through time by a standard Wiener process ($W_t$), are generated by the continuous time martingale (Poon (2008))

$$dp_t = \sigma_t \, dW_t.$$

It follows that the conditional variance of the single period returns, $r_{t+1} = p_{t+1} - p_t$, is

$$\sigma^2_{t+1} = \int_t^{t+1} \sigma_s^2 \, ds.$$

This is also known as the integrated volatility for the period $t$ to $t+1$. Suppose the sampling frequency of the tick data into regularly spaced time intervals is denoted by $f$, so that between period $t$ and $t+1$ there are $f$ continuously compounded returns $r_{t+j/f} = p_{t+j/f} - p_{t+(j-1)/f}$ for $j = 1, \ldots, f$. Hence, we can estimate the Realised Volatility (RV) based on intraday returns between periods $t$ and $t+1$ as

$$RV_{t+1} = \sum_{j=1}^{f} r^2_{t+j/f}.$$

The argument here is that, provided we sample at frequent enough time steps ($f \to \infty$), the volatility can be observed theoretically from the sample path of the return process and hence (Karatzas and Shreve (1991); Poon (2008))

$$\operatorname*{plim}_{f \to \infty} \sum_{j=1}^{f} r^2_{t+j/f} = \int_t^{t+1} \sigma_s^2 \, ds,$$

which says that the RV of a sequence of returns asymptotically approaches the integrated volatility and hence the RV is a reasonable estimate of current volatility levels.
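In code, the RV estimate is simply the sum of squared log returns over the sampled bars. The sketch below assumes a list of strictly positive bar prices (for example 5-minute closes); the function names are illustrative, and the square root is provided separately for use where a volatility rather than a variance is needed.

```python
import math


def realised_variance(prices):
    """Sum of squared log returns between consecutive bar prices."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    log_returns = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    return sum(r * r for r in log_returns)


def realised_volatility(prices):
    """Square root of the realised variance, on the same scale as returns."""
    return math.sqrt(realised_variance(prices))
```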

3 Testing for Statistical Arbitrage

To test the overall trading strategy for statistical arbitrage, we implement a novel statistical test originally proposed by Hogan et al. (2004) and later modified by Jarrow et al. (2012), applying it to the overall strategy's profits and losses (PL). The idea is to axiomatically define the conditions under which a statistical arbitrage exists and to assume a parametric model for incremental trading profits, in order to form a null hypothesis derived from the union of several sub-hypotheses which are formulated to facilitate empirical tests of statistical arbitrage. The modified test proposed by Jarrow et al. (2012), called the Min-$t$ test, is derived from a set of restrictions imposed on the parameters defined by the statistical arbitrage null hypothesis and is applied to a given trading strategy to test for statistical arbitrage. The Min-$t$ statistic is argued to provide a much more efficient and powerful statistical test than the Bonferroni inequality used in Hogan et al. (2004). The statistical power of the Bonferroni approach falls as the number of sub-hypotheses increases; as a result, it is often unable to reject an incorrect null hypothesis, leading to a large Type II error.

To set the scene and introduce the concept of a statistical arbitrage, suppose that in some economy, a stock (portfolio)22 $S_t$ and a money market account23 $B_t$ are traded. Let the stochastic process $(x(t), y(t))_{t \geq 0}$ represent a zero initial cost trading strategy that trades $x(t)$ units of some portfolio $S_t$ and $y(t)$ units of the money market account at a given time $t$. Denote the cumulative trading profits at time $t$ by $V(t)$. Let the time series of discounted cumulative trading profits generated by the trading strategy be denoted by $v(t_1), v(t_2), \ldots, v(t_n)$, where $v(t_i) = V(t_i)/B(t_i)$ for each $i = 1, \ldots, n$. Denote the increments of the discounted cumulative profits at each time $i$ by $\Delta v_i = v(t_i) - v(t_{i-1})$. Then, a statistical arbitrage is defined as:

Definition 1 (Statistical Arbitrage (Hogan et al. (2004); Jarrow et al. (2012))).

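A statistical arbitrage is a zero-cost, self-financing trading strategy with cumulative discounted trading profits $v(t)$ such that:

  1. $v(0) = 0$,

  2. $\lim_{t \to \infty} E^{P}[v(t)] > 0$,

  3. $\lim_{t \to \infty} P[v(t) < 0] = 0$, and

  4. $\lim_{t \to \infty} \mathrm{Var}[\Delta v(t) \mid \Delta v(t) < 0] = 0$.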

In other words, a statistical arbitrage is a trading strategy that 1) has zero initial cost, 2) in the limit has positive expected discounted cumulative profits, 3) in the limit has a probability of loss that converges to zero and 4) has a variance of negative incremental trading profits (losses) that converges to zero in the limit. It is clear that deterministic arbitrage stemming from traditional financial mathematics is in fact a special case of statistical arbitrage (De Wit (2013)).

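In order to test for statistical arbitrage, assume that the incremental discounted trading profits evolve over time according to the process

$$\Delta v_i = \mu i^{\theta} + \sigma i^{\lambda} z_i,$$

where $i = 1, \ldots, n$. There are two cases to consider for the innovations: 1) $z_i$ are i.i.d. N(0,1) normal uncorrelated random variables satisfying $z_0 = 0$ or 2) $z_i$ follows an MA(1) process given by:

$$z_i = \epsilon_i + \phi \epsilon_{i-1},$$

in which case the innovations are non-normal and correlated. Here, $\epsilon_i$ is an i.i.d. N(0,1) normal uncorrelated random variable. It is also assumed that $\Delta v_0 = 0$ and, in the case of our algorithm, that the discounted cumulative trading profits at the initial time are zero. We will refer to the first model (normal uncorrelated innovations) as the unconstrained mean (UM) model and the second model (non-normal and correlated innovations) as the unconstrained mean with correlation (UMC) model. Furthermore, we refer to the corresponding models with $\theta = 0$ as the constrained mean (CM) and constrained mean with correlation (CMC) models respectively, which assume constant expected incremental profits over time, and hence have an incremental profit process given by:

$$\Delta v_i = \mu + \sigma i^{\lambda} z_i. \tag{17}$$

The discounted cumulative trading profits for the UM model at terminal time $t_n$, discounted back to the initial time, which are generated by a trading strategy are given by

$$v(t_n) = \sum_{i=1}^{n} \Delta v_i \sim N\left(\mu \sum_{i=1}^{n} i^{\theta},\; \sigma^2 \sum_{i=1}^{n} i^{2\lambda}\right). \tag{18}$$

From equation 18, it is straightforward to show that the log-likelihood function for the discounted incremental trading profits is given by:

$$\log L(\mu, \sigma^2, \theta, \lambda \mid \Delta v) = -\frac{1}{2} \sum_{i=1}^{n} \log\left(\sigma^2 i^{2\lambda}\right) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} \frac{\left(\Delta v_i - \mu i^{\theta}\right)^2}{i^{2\lambda}}. \tag{19}$$

The probability of a trading strategy generating a loss after $n$ periods is as follows (Jarrow et al. (2012))

$$\Pr\left(v(t_n) < 0\right) = \Phi\left(\frac{-\mu \sum_{i=1}^{n} i^{\theta}}{\sigma \sqrt{\sum_{i=1}^{n} i^{2\lambda}}}\right), \tag{20}$$

where $\Phi$ denotes the cumulative standard normal distribution function. For the CM model, equation 20 is easily adjusted by setting $\theta$ equal to zero. This probability converges to zero at a rate that is faster than exponential.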
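The loss probability can be evaluated directly once parameter estimates are available. The sketch below assumes the parametrisation of Hogan et al. (2004) and Jarrow et al. (2012), in which the increments are $\Delta v_i = \mu i^{\theta} + \sigma i^{\lambda} z_i$ with standard normal $z_i$, so that the cumulative profit after $n$ periods is normal with mean $\mu \sum_i i^{\theta}$ and variance $\sigma^2 \sum_i i^{2\lambda}$. The function and argument names are illustrative.

```python
import math


def normal_cdf(z):
    """Cumulative standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_of_loss(n, mu, sigma, theta=0.0, lam=0.0):
    """Probability that cumulative discounted profits are negative after n periods.

    theta = 0 gives the constrained mean (CM) case.
    """
    mean = mu * sum(i ** theta for i in range(1, n + 1))
    std = sigma * math.sqrt(sum(i ** (2.0 * lam) for i in range(1, n + 1)))
    return normal_cdf(-mean / std)
```

For a positive $\mu$ and a negative $\lambda$, calling `probability_of_loss` for increasing `n` traces out the decay of the loss probability towards zero.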

As mentioned previously, to facilitate empirical tests of statistical arbitrage under Definition 1, a set of sub-hypotheses are formulated to impose a set of restrictions on the parameters of the underlying process driving discounted cumulative incremental trading profits and are as follows:

Proposition 3.1 (UM Model Hypothesis (Jarrow et al. (2012))).

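Under the four axioms defined in Definition 1, a trading strategy generates a statistical arbitrage under the UM model if the discounted incremental trading profits, with mean parameters $\mu$ and $\theta$ and volatility parameters $\sigma$ and $\lambda$, satisfy the intersection of the following four sub-hypotheses jointly: i.) $\mu > 0$, ii.) $-\lambda > 0$ or $\theta - \lambda > 0$, iii.) $\theta - \lambda + \frac{1}{2} > 0$, and iv.) $\theta + 1 > 0$.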

An intersection of the above sub-hypotheses defines a statistical arbitrage and so, by De Morgan's Laws,24 the null hypothesis of no statistical arbitrage is defined by a union of sub-hypotheses. Hence, the no statistical arbitrage null hypothesis is the set of sub-hypotheses which are taken to be the complements of each of the sub-hypotheses in Proposition 3.1:

Proposition 3.2 (UM Model Alternative Hypothesis (Hogan et al. (2004); Jarrow et al. (2012))).

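Under the four axioms defined in Definition 1, a trading strategy does not generate a statistical arbitrage if the discounted incremental trading profits, with mean parameters $\mu$ and $\theta$ and volatility parameters $\sigma$ and $\lambda$, satisfy any one of the following four sub-hypotheses: i.) $\mu \leq 0$, ii.) $-\lambda \leq 0$ and $\theta - \lambda \leq 0$, iii.) $\theta - \lambda + \frac{1}{2} \leq 0$, and iv.) $\theta + 1 \leq 0$.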

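The null hypothesis is not rejected provided that a single sub-hypothesis holds. The Min-$t$ test is then used to test the above null hypothesis of no statistical arbitrage by considering each sub-hypothesis separately using the t-statistics $t(\hat{\mu})$, $t(-\hat{\lambda})$, $t(\hat{\theta} - \hat{\lambda})$, $t(\hat{\theta} - \hat{\lambda} + 0.5)$ and $t(\hat{\theta} + 1)$, where the hats denote the Maximum Likelihood Estimates (MLE) of the parameters. The Min-$t$ statistic is defined as (Jarrow et al. (2012))

$$\text{Min-}t = \min\left\{ t(\hat{\mu}),\; t(\hat{\theta} - \hat{\lambda} + 0.5),\; t(\hat{\theta} + 1),\; \max\left[ t(-\hat{\lambda}),\, t(\hat{\theta} - \hat{\lambda}) \right] \right\}. \tag{21}$$

The intuition is that the Min-$t$ statistic returns the smallest test statistic, which corresponds to the sub-hypothesis that is closest to being accepted. The no statistical arbitrage null is then rejected if Min-$t > t_c$, where $t_c$ depends on the significance level of the test, which we will refer to as $\alpha$. Since the probability of rejecting the null cannot exceed the significance level $\alpha$, we have the following condition for the probability of rejecting the null at the $\alpha$ significance level

$$\Pr\left(\text{Min-}t > t_c \mid \mu, \lambda, \theta, \sigma\right) \leq \alpha. \tag{22}$$

What remains is for us to compute the critical value $t_c$. We will implement a Monte Carlo simulation procedure to compute $t_c$, which we describe in more detail in section 3.1 step 5 below.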

3.1 Outline of the Statistical Arbitrage Test Procedure

The steps involved in testing for statistical arbitrage are outlined below:

  1. Trading increments: From the vector of cumulative trading profits and losses, compute the increments $\Delta v_i = v(t_i) - v(t_{i-1})$ for $i = 1, \ldots, n$.

  2. Perform MLE: Compute the log-likelihood function, as given in equation 19, and maximise it to find the estimates of the four parameters, namely $\mu$, $\sigma$, $\theta$ and $\lambda$. The log-likelihood function is adjusted depending on whether the CM or UM test is implemented. We only consider the CM test in this study. Since MATLAB's built-in constrained optimization algorithm25 only performs minimization, we minimise the negative of the log-likelihood function, which is equivalent to maximising the log-likelihood.

  3. Standard errors: From the estimated parameters in the MLE step above, compute the negative Hessian evaluated at the MLE estimates, which is the Fisher Information (FI) matrix. In order to compute the Hessian, the analytical partial derivatives are derived from equation 19. Standard errors are then taken to be the square roots of the diagonal elements of the inverse of the FI matrix, since the inverse of the Fisher information matrix is an asymptotic estimator of the covariance matrix.

  4. Min-$t$ statistic: Compute the t-statistics for each of the sub-hypotheses, which are given by $t(\hat{\mu})$, $t(-\hat{\lambda})$, $t(\hat{\theta} - \hat{\lambda})$, $t(\hat{\theta} - \hat{\lambda} + 0.5)$ and $t(\hat{\theta} + 1)$, and hence the resulting Min-$t$ statistic given by equation 21. Since $\theta = 0$ under the CM model, $t(\hat{\theta} - \hat{\lambda})$, $t(\hat{\theta} - \hat{\lambda} + 0.5)$ and $t(\hat{\theta} + 1)$ do not need to be considered for the CM test.

  5. Critical values: Compute the critical value $t_c$ at the $\alpha$ significance level using the Monte Carlo procedure (uncorrelated normal errors) and bootstrapping (correlated non-normal errors).

    1. CM model: First, simulate 5000 different profit processes using equation 17 with the chosen parameter values.26 For each of the 5000 profit processes, perform MLE to get the estimated parameters, the associated t-statistics and finally the Min-$t$ statistic. $t_c$ is then taken to be the $1-\alpha$ quantile of the resulting distribution of Min-$t$ values.

  6. P-values: Compute the empirical probability of rejecting the null hypothesis at the $\alpha$ significance level using equation 22, utilising the critical value from the previous step and the simulated Min-$t$ statistics.

  7. n-Period Probability of Loss: Compute the probability of loss after $n$ periods for each $n = 1, \ldots, T$ and observe the number of trading periods it takes for the probability of loss to converge to zero (or to fall below 5%, as in the literature). This is done by computing the MLE estimates for the vector of increments ($\Delta v_1, \ldots, \Delta v_n$) for each given $n$ and substituting these estimates into equation 20.
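The CM version of steps 2 to 5 can be sketched in pure Python under several simplifying assumptions that differ from the procedure described above, and which should be read as such. The sketch assumes the CM model $\Delta v_i = \mu + \sigma i^{\lambda} z_i$, replaces the constrained optimiser with a profile-likelihood grid search over $\lambda$, uses closed-form standard errors from the expected (rather than observed) Fisher information, and simulates the null with $\mu = 0$, $\lambda = 0$ and $\sigma = 0.01$, values chosen here purely for illustration. All function names are hypothetical.

```python
import math
import random


def fit_cm(increments, lam_grid=None):
    """Profile-likelihood MLE of (mu, sigma^2, lambda) for the assumed CM model."""
    n = len(increments)
    if lam_grid is None:
        lam_grid = [k / 50.0 for k in range(-100, 101)]  # -2.0 to 2.0, step 0.02
    logs = [math.log(i) for i in range(1, n + 1)]
    sum_logs = sum(logs)
    best = None
    for lam in lam_grid:
        w = [math.exp(-2.0 * lam * L) for L in logs]  # weights i**(-2*lam)
        sw = sum(w)
        mu = sum(wi * d for wi, d in zip(w, increments)) / sw
        s2 = sum(wi * (d - mu) ** 2 for wi, d in zip(w, increments)) / n
        if s2 <= 0.0:
            continue
        loglik = -0.5 * n * math.log(s2) - lam * sum_logs  # up to a constant
        if best is None or loglik > best[0]:
            best = (loglik, mu, s2, lam, sw)
    if best is None:
        raise ValueError("increments must not all be identical")
    return best[1:]


def min_t_cm(increments):
    """Min-t statistic for the CM model: min of t(mu_hat) and t(-lambda_hat)."""
    n = len(increments)
    mu, s2, lam, sw = fit_cm(increments)
    logs = [math.log(i) for i in range(1, n + 1)]
    mean_log = sum(logs) / n
    se_mu = math.sqrt(s2 / sw)  # from expected information for mu
    se_lam = 1.0 / math.sqrt(2.0 * sum((L - mean_log) ** 2 for L in logs))
    return min(mu / se_mu, -lam / se_lam)


def simulate_cm(n, mu, sigma, lam, rng):
    """Simulate n increments from the assumed CM process."""
    return [mu + sigma * i ** lam * rng.gauss(0.0, 1.0) for i in range(1, n + 1)]


def critical_value(n, alpha=0.05, n_sims=5000, sigma=0.01, seed=0):
    """Monte Carlo (1 - alpha) quantile of Min-t under mu = 0, lambda = 0."""
    rng = random.Random(seed)
    stats = sorted(min_t_cm(simulate_cm(n, 0.0, sigma, 0.0, rng))
                   for _ in range(n_sims))
    index = min(n_sims - 1, int(math.ceil((1.0 - alpha) * n_sims)) - 1)
    return stats[index]
```

The null of no statistical arbitrage would then be rejected when `min_t_cm` applied to the observed increments exceeds `critical_value` for the same number of periods. The grid search trades precision in $\hat{\lambda}$ (here 0.02) for simplicity and needs at least two distinct increments.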

3.2 Estimates of the Probability of Back-test Overfitting (PBO)

Bailey et al. (2014) heavily criticise recent studies which claim to have designed profitable investment or trading strategies, since many of these studies are based only on in-sample (IS) statistics, without evaluating out-of-sample (OOS) performance. We briefly address this concern by computing an estimate of the probability of back-test overfitting (PBO) using the combinatorially symmetric cross-validation (CSCV) procedure outlined in Bailey et al. (2016). Typically, an investor or researcher will run many trial back-tests to select the parameter combinations which optimise the performance of the algorithm (usually based on some performance evaluation criterion such as the Sharpe Ratio). The idea is to perform CSCV on the matrix of performance series over time,27 with one series for each of the separate trial simulations of the algorithm.

Here, we must be clear that when we refer to IS, we do not mean the “training set” per se, during which the moving average look-back parameters were calculated, for example. Rather, we refer to IS as being the subset of observations utilised in selecting the optimal strategy from the back-test trials.

In the case of the algorithm proposed in this study, since the large set of trialled parameters forms the basis of the learning algorithm in the form of the experts, we cannot observe the effect of different parameter settings on the overall strategy, as these are already built into the underlying algorithm. Rather, we run trial back-test simulations on independent subsets of historical data to get an idea of how the algorithm performs on different subsets of unseen data. We can then implement the CSCV procedure on the matrix of profits and losses resulting from the trials to recover a PBO estimate. Essentially, there is no training of parameters taking place in our model, as all parameter combinations are considered, and the weights of the performance-weighted average of the experts' strategies associated with the different parameter combinations are “learnt”.

More specifically, we choose a back-test length for each subset and split the entire history of OHLCV data into subsets of this length. The learning algorithm is then implemented on each subset to produce profit and loss time series. Note that the subsets will be completely independent from one another as there is no overlapping of the data that each separate simulation is run on. The results from the simulations are presented in table 1 below.

Implementation    Back-test trials    Back-test length    PBO
Daily             30                  60 days             1.4%
Intraday-daily    22                  3 days              11.4%
Table 1: Number of back-test trials, back-test length for each simulation and the resulting PBO estimates for the daily and intraday-daily implementation.
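The CSCV procedure used to obtain these estimates can be sketched as follows. This is an illustrative reading of Bailey et al. (2016), not the code used for Table 1: the number of blocks, the per-period Sharpe ratio as the performance measure (using the population standard deviation) and the function names are assumptions. Each trial is a performance series of equal length, which here would be the profit and loss series from one independent back-test subset.

```python
import itertools
import math
import statistics


def sharpe(series):
    """Per-period Sharpe ratio; zero if the series has no dispersion."""
    sd = statistics.pstdev(series)
    return statistics.mean(series) / sd if sd > 0 else 0.0


def pbo_cscv(trials, n_blocks=4):
    """Estimate the probability of back-test overfitting by CSCV.

    trials   : list of N performance series of equal length
    n_blocks : even number of disjoint time blocks to split the series into
    """
    n_trials, length = len(trials), len(trials[0])
    block = length // n_blocks
    blocks = [range(k * block, (k + 1) * block) for k in range(n_blocks)]
    logits = []
    # Every choice of half the blocks is used once as the in-sample set.
    for in_sample in itertools.combinations(range(n_blocks), n_blocks // 2):
        is_rows = [t for k in in_sample for t in blocks[k]]
        oos_rows = [t for k in range(n_blocks) if k not in in_sample
                    for t in blocks[k]]
        is_perf = [sharpe([s[t] for t in is_rows]) for s in trials]
        oos_perf = [sharpe([s[t] for t in oos_rows]) for s in trials]
        best = max(range(n_trials), key=is_perf.__getitem__)
        # Out-of-sample rank of the in-sample winner: 1 = worst, N = best.
        rank = sum(1 for p in oos_perf if p <= oos_perf[best])
        omega = rank / (n_trials + 1)
        logits.append(math.log(omega / (1.0 - omega)))
    # PBO: share of splits where the IS winner is at or below the OOS median.
    return sum(1 for value in logits if value <= 0.0) / len(logits)
```

A PBO near zero means the trial that looks best in-sample also tends to rank well out-of-sample, while a value near one half or above suggests that in-sample selection carries little information.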

4 The Data

4.1 Daily Data

The daily data is sourced from Thomson Reuters and contains data corresponding to all stocks listed on the JSE Top 40 index.28 The data set consists of data for 42 stocks over the period 01-01-2005 to 29-04-2016; however, we only use the stocks which traded more than 60% of the time over this period. Removing the stocks that do not meet this threshold leaves us with a total of 31 stocks. The data comprises the opening price, closing price, lowest price, highest price and daily traded volume (OHLCV). In addition to these 31 stocks, we also require a risk-free asset for balancing the portfolio. We choose to trade the Short Term Fixed Interest (STeFI) index. The STeFI benchmark is a proprietary index that measures the performance of Short Term Fixed Interest or money market investment instruments in South Africa. It is constructed by Alexander Forbes (and formerly by the South African Futures Exchange (SAFEX)) and has become the industry benchmark for short-term cash equivalent investments (up to 12 months) (“etfSA STeFI” (2011)).

4.2 Intraday-Daily Data

Bloomberg is the source of all tick (intraday) data used in this paper. The data set consists of 30 of the Top 40 stocks on the JSE from 02-01-2018 to 29-06-2018. The data is then sampled at 5-minute intervals to create an OHLCV entry for all 5-minute intervals over the 6-month period. We remove the first 10 minutes and last 20 minutes of the continuous trading session (9:00-16:50) as the market is relatively illiquid and volatile during these times which may lead to spurious trade decisions. We are thus left with 88 OHLCV entries for each stock on any given day. In addition to the intraday data, daily OHLCV data for the specified period is required for the last transaction on any given day. As in the daily data case, we make use of the STeFI index as the risk-free asset, and hence the daily entries for the STeFI index are included in this data set. The data was sourced from a Bloomberg terminal using the R Bloomberg API, Rblpapi, and all data processing is done in MATLAB to get the data into the required form for the learning algorithm.

5 Results and Analysis

5.1 Daily Data

In this section, we implement the various algorithms described above and plot a series of graphs for the daily JSE Top 40 data discussed in section 4 above. We plot the following graphs: first, the overall portfolio wealth over time as described above; second, the cumulative profits and losses over time; third, the relative population wealth of experts, which corresponds to the wealth accumulated over time by each of the experts competing for wealth in the algorithm; and finally, the relative population wealth of the strategies, which takes the mean over all experts for each given trading strategy to get an accumulated wealth path for each technical trading rule.

For the purpose of testing the learning algorithm, we will identify the 15 most liquid stocks over one year prior to the start of active trading. The stocks, ranked by liquidity, are as follows: FSRJ.J, OMLJ.J, CFRJ.J, MTNJ.J, SLMJ.J, NTCJ.J, BILJ.J, SBKJ.J, WHLJ.J, AGLJ.J, SOLJ.J, GRTJ.J, INPJ.J, MNDJ.J and RMHJ.J.

No Transaction Costs

Figure 1: Overall cumulative portfolio wealth for daily data with no transaction costs (blue) and the benchmark BCRP strategy (orange). The figure inset illustrates the associated profits and losses (PL) of the strategy.
Figure 2: Expert wealth (Sh) for all experts (first panel) and mean expert wealth of all experts for each trading strategy (second panel) for daily data with no transaction costs.

Barring transaction costs, the portfolio clearly makes favourable cumulative returns on equity over the six-year period, as is evident in figure 1. The performance of the online learning algorithm (blue) is similar to that of the benchmark BCRP strategy (orange), which is promising as the original literature proves that the algorithm should track such a benchmark in the long run. The inset of figure 1 illustrates that the overall strategy provides consistent positive trading profits over the entire trading horizon. figure 2(a) shows the expert wealth for all experts and figure 2(b) shows the mean expert wealth for each strategy. These figures show that, on average, the underlying experts perform fairly poorly compared to the overall strategy; however, there is evidence that some experts make satisfactory returns over the period.

table 2 and table 3 provide the group summary statistics of the terminal wealths of experts and of the experts’ profits and losses over the entire trading horizon respectively, where experts are grouped by their underlying strategy. The online Z-Anticor algorithm produces the best expert (maximum terminal wealth of 2.3886), followed by the slow stochastic rule (2.1023), while Z-Anticor also produces the greatest mean terminal wealth over all experts (column 2). Additionally, Z-Anticor produces experts with wealths that vary the most (highest standard deviation). Williams %R and Online Anti-Z-BCRP produce the worst experts (minimum terminal wealths of 0.4662 and 0.4649 respectively). The trading rules with the lowest mean terminal wealth and the worst mean ranking are SAR and slow stochastic respectively. With regards to the experts’ profits and losses (table 3), the momentum rule (MOM) produces the expert with the greatest profit in a single period. SAR followed by Anti-Z-BCRP produce the worst and second worst mean profit/loss per trading period respectively, whereas Z-Anticor and Z-BCRP achieve the best mean profit/loss per trading period.

Strategy Mean (mean rank) St. Dev. Min Max
EMA X-over 0.8739 (673.6343) 0.1767 0.5216 1.4493
Ichimoku Kijun Sen 0.9508 (623.3194) 0.2313 0.5424 1.5427
MACD 0.9504 (657.7639) 0.1750 0.5601 1.6065
Moving Ave X-over 0.8895 (632.6944) 0.1930 0.5206 1.4505
ACC 1.0994 (736.5833) 0.3131 0.5283 1.9921
BOLL 1.0499 (569.1944) 0.3536 0.6076 1.7746
Fast Stochastic 0.9995 (778.6111) 0.3699 0.6006 1.8555
MARSI 1.0723 (639.3611) 0.2081 0.6947 1.6917
MOM 1.0403 (681.4444) 0.1353 0.7349 1.3595
Online Anti-Z-BCRP 0.7579 (731.9444) 0.1935 0.4649 1.0924
Online Z-Anticor 1.3155 (694.5278) 0.4388 0.6363 2.3886
Online Z-BCRP 1.2818 (652.8611) 0.2637 0.8561 1.8341
PROC 0.8963 (718.0833) 0.1631 0.6305 1.2161
RSI 1.1339 (757.3889) 0.2544 0.6440 1.7059
SAR 0.7314 (654.1111) 0.0619 0.6683 0.8683
Slow Stochastic 1.1135 (793.2222) 0.3302 0.6955 2.1023
Williams %R 0.9416 (728.6944) 0.3150 0.4662 1.5131
Table 2: Group summary statistics of the terminal wealths of experts grouped by their underlying strategy for daily trading. In brackets next to each mean is the mean overall ranking of experts within that strategy group.
Strategy Mean St. Dev. Min Max
EMA X-over -0.00010 0.00633 -0.09745 0.08074
Ichimoku Kijun Sen -0.00004 0.00723 -0.10467 0.06157
MACD -0.00003 0.00725 -0.15993 0.08074
Moving Ave X-over -0.00009 0.00644 -0.15993 0.11482
ACC 0.00007 0.00760 -0.15993 0.08028
BOLL 0.00002 0.00711 -0.06457 0.06480
Fast Stochastic -0.00001 0.00847 -0.06469 0.06279
MARSI 0.00006 0.00612 -0.06788 0.06527
MOM 0.00004 0.00603 -0.06051 0.15820
Online Anti-Z-BCRP -0.00022 0.00773 -0.09847 0.09336
Online Z-Anticor 0.00021 0.00759 -0.06475 0.09773
Online Z-BCRP 0.00021 0.00771 -0.09336 0.09847
PROC -0.00007 0.00733 -0.10467 0.09745
RSI 0.00010 0.00666 -0.06460 0.09745
SAR -0.00023 0.00724 -0.10467 0.08724
Slow Stochastic 0.00009 0.00809 -0.06480 0.06820
Williams %R -0.00006 0.00815 -0.06820 0.06317
Table 3: Group summary statistics of the experts’ profits and losses per period grouped by their underlying strategy.
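Group statistics of the kind reported in table 2 and table 3 can be computed with a simple group-by. The following is a minimal sketch in Python with pandas, not the code used to produce the tables; the function name `group_summary` and its arguments are hypothetical, and the overall expert ranking is taken as a given input because its construction is not restated in this section.

```python
import pandas as pd


def group_summary(wealth: pd.Series, strategy: pd.Series, rank: pd.Series) -> pd.DataFrame:
    """Summarise expert terminal wealth by underlying strategy.

    All three inputs share one index with one entry per expert:
    `wealth` is the terminal wealth, `strategy` the strategy label and
    `rank` the expert's overall ranking (supplied by the caller).
    """
    df = pd.DataFrame({"wealth": wealth, "strategy": strategy, "rank": rank})
    grouped = df.groupby("strategy")
    out = grouped["wealth"].agg(["mean", "std", "min", "max"])
    out["mean_rank"] = grouped["rank"].mean()
    return out
```

The same function applies to per-period profits and losses if `wealth` is replaced by a long-format series of expert profits and losses, with the ranking column ignored.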

figure 3(a) illustrates the 2-D latent space of a Variational Autoencoder (VAE) fitted to the wealth time series of all the experts, with experts coloured by object cluster. It is not surprising that the experts’ wealth time series show quite well-defined clusters in terms of the stocks that experts trade in their portfolios: the stocks that each expert trades are directly related to the decisions they make given the incoming data, and hence to the corresponding returns (wealth) they achieve.

Figure 3: figure 3(a) and figure 3(b) show the latent space of a Variational Autoencoder on the time series of expert wealth, implemented using Keras in Python. In figure 3(a) experts are coloured by which of the 4 object clusters they trade, whereas in figure 3(b) experts are coloured by their underlying trading strategy.

To provide a comparison, in figure 3(b) we plot the same results as above but colour the experts by their underlying strategy. The VAE picks up much clearer similarities (dissimilarities) between the experts based on the stocks they trade than on the strategy they utilise, providing evidence that the achieved wealth depends much more strongly on the stock choice than on the chosen strategy. This is an important point to consider, and indicates that it may be worth considering more sophisticated ways to choose the stocks to trade rather than developing more sophisticated or profitable strategies. A discussion on the features that should be considered by a quantitative investment manager in assessing an asset’s usefulness is provided in Samo and Hendricks (2018).

Next, we implement the CM test for statistical arbitrage on the daily cumulative profits and losses (PL) for the strategy without transaction costs. To obtain a result that is comparable with Jarrow et al. (2012), we choose a period of 400 days to test our strategy. The 400-day window of realised profits and losses begins only after an initial stretch of trading days, which allows the algorithm to initialise and leaves enough time for the majority of the experts to have sufficient data to begin making trading decisions. Having simulated the 5000 different Min-t statistics as in section 3.1 step 5a using simulations of the profit process in equation 17, figure 4 illustrates the histogram of Min-t values. The critical value is computed as the 0.95-quantile of the simulated distribution, which corresponds to a significance level of 5%, and is illustrated by the red vertical line. The Min-t statistic resulting from the realised incremental profits and losses of the overall strategy is 3.0183 (vertical green line). By equation 22, we recover a p-value of zero. Thus, we conclude that there is significant evidence to reject the null of no statistical arbitrage at the 5% significance level.
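The final decision step of the test can be sketched as follows, assuming the simulated Min-t statistics have already been generated. This is a minimal Python illustration rather than the implementation used here: the function name `min_t_decision` is hypothetical, and taking the p-value to be the fraction of simulated statistics at or above the observed one is an assumption about how the empirical p-value is formed.

```python
import numpy as np


def min_t_decision(simulated, observed, alpha=0.05):
    """Compare an observed Min-t statistic with its simulated null distribution.

    Returns the critical value (the (1 - alpha)-quantile of the simulated
    statistics), an empirical p-value (assumed here to be the share of
    simulated statistics >= observed) and whether the null is rejected.
    """
    simulated = np.asarray(simulated, dtype=float)
    critical = float(np.quantile(simulated, 1.0 - alpha))
    p_value = float(np.mean(simulated >= observed))
    return critical, p_value, bool(observed > critical)
```

An observed statistic that exceeds every simulated value, as with the value of 3.0183 reported above, gives an empirical p-value of zero under this convention.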

Figure 4: Histogram of the 5000 simulated Min-t statistics resulting from the CM test implemented on the simulated incremental process given in equation 17, along with the Min-t statistic (green) for the overall strategy’s profit and loss sequence over the 400-day test period, without any account for transaction costs. The figure inset displays the probability of loss for each of the first 30 trading days, where the probability of loss for a given day is computed from the profit and loss process from the first trading period up to that day.

In addition to testing for statistical arbitrage, we also report the number of days it takes for the probability of loss of the strategy to decline below 5% using equation 22 adjusted for the case of the CM model. As discussed in section 3.1 step 7, for each period in the test window we perform MLE on the incremental profits and losses up to that period to get the parameter estimates. We then substitute these estimates into equation 22 to get an estimate of the probability of loss for that period. This is all done in terms of the CM model. The inset of figure 4 illustrates the probability of loss for each of the first 25 trading days, where the probability of loss for a given day is computed from the profit and loss process from the first trading period up until that day. As is evident from the inset, it takes roughly 10 periods for the probability of loss to fall below 5%.
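To make the probability-of-loss calculation concrete, the sketch below assumes the constrained-mean model has the form used by Jarrow et al. (2012), in which the profit increment in period i is mu + sigma * i**lam * z_i with standard normal z_i. Under that assumption the cumulative profit after n periods is normal with mean n * mu and variance sigma**2 * sum(i**(2 * lam)), so the probability of loss is a normal tail probability. The function names are hypothetical, and the parameters are held fixed here, whereas the procedure described above re-estimates them by MLE for each period.

```python
import math


def prob_loss_cm(mu, sigma, lam, n):
    """P(cumulative profit after n periods < 0) under the assumed CM model."""
    var_sum = sum(i ** (2.0 * lam) for i in range(1, n + 1))
    z = -mu * n / (sigma * math.sqrt(var_sum))
    # Standard normal CDF evaluated at z, written with the complementary error function.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def periods_until_below(mu, sigma, lam, threshold=0.05, max_n=10_000):
    """First n at which the probability of loss drops below `threshold`, or None."""
    for n in range(1, max_n + 1):
        if prob_loss_cm(mu, sigma, lam, n) < threshold:
            return n
    return None
```

For a positive mean and a volatility growth rate below one half, the probability of loss decays towards zero as n grows, which is the behaviour shown in the figure inset.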

Transaction Costs

In this section we reproduce the results from above, this time including transaction costs for daily trading as discussed in section 2.5. Once the direct and indirect costs (section 2.5) have been computed, the transaction cost is subtracted from the profit and loss of each day, and the resulting value is compounded onto the previous period’s wealth to get the wealth for the current period. These daily net profits and losses are summed to get the cumulative profit and loss PL.
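One reading of this cost adjustment is sketched below in Python. It assumes that the per-period profit and loss and the per-period cost are both expressed as fractions of current wealth, so that the net value compounds multiplicatively; this interpretation, the function name `apply_costs` and the starting wealth of one are assumptions for illustration, not a statement of the exact update used in this paper.

```python
import numpy as np


def apply_costs(pl, costs, initial_wealth=1.0):
    """Net each period's profit and loss of costs, then compound and accumulate.

    `pl` and `costs` are per-period values, assumed to be fractions of wealth.
    Returns the wealth path and the cumulative net profit and loss.
    """
    net = np.asarray(pl, dtype=float) - np.asarray(costs, dtype=float)
    wealth = initial_wealth * np.cumprod(1.0 + net)  # compounded wealth
    cumulative_pl = np.cumsum(net)                   # summed net profit and loss
    return wealth, cumulative_pl
```

For example, a gross gain of 1% with a 0.1% cost followed by a gross loss of 2% with a 0.1% cost gives net increments of 0.9% and -2.1%.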

It is clear from the inset of figure 5(a), which illustrates the profits and losses (PL) of the overall strategy less the transaction costs for each period, that consistent losses are incurred when transaction costs are incorporated. Furthermore, there is no evidence to reject the no statistical arbitrage null hypothesis, as the Min-t statistic resulting from the overall strategy is well below the critical value at the 95th percentile of the histogram, as illustrated in figure 5(b). In addition, although the probability of loss of the strategy with transaction costs included initially declines towards zero, it eventually settles on a value of one. This is illustrated in the inset of figure 5(b).

(a) Overall cumulative portfolio wealth (S) for daily data with transaction costs. The figure inset illustrates the profits and losses (PL) for the overall strategy for daily data with transaction costs.
(b) Histogram of the 5000 simulated Min-t statistics resulting from the CM model and the incremental process given in equation 17, along with the Min-t statistic (green) for the overall strategy’s profit and loss sequence over the 400-day test period with transaction costs incorporated. Also illustrated is the critical value at the 5% significance level (red). The figure inset shows the probability of the overall trading strategy generating a loss for each of the first 400 trading days.
Figure 5: The performance of the algorithm (figure 5(a)) and the results of the statistical arbitrage test (figure 5(b)) on daily data with transaction costs incorporated.

Considering the evidence contained in figure 5(a), figure 5(b) and its inset, the overall strategy does not survive historical back-tests in terms of profitability when transaction costs are considered, and may not be well suited to an investor utilising daily data who has a limited time to make adequate profits. This is in agreement with Schulmeister (2009) in that there is a strong possibility that stock price and volume trends have shifted to higher frequencies than the daily time scale and that, as a result, the profits of trading strategies on such time scales have diminished over time.

5.2 Intraday-Daily Data

Below we report the results of the algorithm implementation for a combination of intraday and daily JSE data as discussed in section 2.3. We run the algorithm on the OHLCV data of the 15 most liquid stocks from a set of 30 of the JSE Top 40. Liquidity is calculated in terms of average daily trade volume for the first 4 days of the period 02-01-2018 to 09-03-2018. The set of 15 stocks is as follows: FSR:SJ, GRT:SJ, SLM:SJ, BGA:SJ, SBK:SJ, WHL:SJ, CFR:SJ, MTN:SJ, DSY:SJ, IMP:SJ, APN:SJ, RMH:SJ, AGL:SJ, VOD:SJ and BIL:SJ. The remaining 40 days of data for the aforementioned period are used to run the learning algorithm. As in the daily data implementation, we again analyse the two cases of trading, with and without transaction costs, which we report in the following two subsections.
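The liquidity screen described above amounts to ranking stocks by mean daily volume over an initial window. A minimal sketch in Python with pandas follows; the function name `most_liquid` and the layout of the input (one row per trading day in chronological order, one column per ticker) are assumptions made for the example.

```python
import pandas as pd


def most_liquid(volume: pd.DataFrame, n_days: int = 4, top: int = 15) -> list:
    """Return the `top` tickers ranked by mean daily volume over the first `n_days` rows."""
    average_daily_volume = volume.iloc[:n_days].mean()
    ranked = average_daily_volume.sort_values(ascending=False)
    return ranked.head(top).index.tolist()
```

Only the first `n_days` rows enter the ranking, so the data later used for trading do not influence the stock selection.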

No Transaction Costs

Without transaction costs, the cumulative wealth achieved by the overall strategy, illustrated in figure 6(a), grows approximately exponentially over time. The associated profits and losses are displayed in the inset of figure 6(a). Incremental profits and losses are much smaller than in the daily data case, resulting in a much smoother wealth curve (compare figure 1).

table 4 is the intraday-daily analogue of table 2. In this case, the exponential moving average crossover strategy (EMA X-over) produces the expert with the greatest terminal wealth and acceleration (ACC) the expert with the least terminal wealth. EMA X-over also produces experts with the highest variation in terminal wealth. Price rate of change (PROC) comfortably provides the best mean ranking of experts among all strategies; however, Z-BCRP produces experts with the highest mean terminal wealth.

Again, as for the daily data case, we implement a test for statistical arbitrage for intraday-daily trading without transaction costs for 400 trading periods, starting from the second trading day when active trading commences, using the intraday-daily profit and loss sequence (PL). figure 6(b) illustrates the histogram of simulated Min-t values, with the 0.95-quantile of the simulated distribution representing the critical value (red), and the Min-t statistic (green) resulting from the incremental profits and losses of the overall strategy produced by the learning algorithm. The resulting critical value is 0.7234 and the Min-t value is 4.2052. Thus, there is strong evidence to reject the null hypothesis of no statistical arbitrage, as the resulting p-value is zero.

Strategy Mean (mean rank) St. Dev. Min Max
EMA X-over 1.0024 (662.7639) 0.0094 0.9801 1.0375
Ichimoku Kijun Sen 0.9989 (710.3750) 0.0085 0.9663 1.0303
MACD 0.9995 (684.8704) 0.0067 0.9720 1.0202
Moving Ave X-over 1.0012 (708.7824) 0.0058 0.9766 1.0204
ACC 0.9953 (831.3333) 0.0079 0.9646 1.0048
BOLL 0.9974 (712.9722) 0.0069 0.9787 1.0089
Fast Stochastic 0.9991 (711.4167) 0.0040 0.9871 1.0085
MARSI 0.9973 (736.2500) 0.0062 0.9824 1.0094
MOM 0.9982 (723.1389) 0.0087 0.9700 1.0082
Online Anti-Z-BCRP 0.9980 (597.3056) 0.0062 0.9828 1.0103
Online Z-Anticor 1.0015 (655.7778) 0.0058 0.9896 1.0180
Online Z-BCRP 1.0031 (566.5833) 0.0069 0.9898 1.0149
PROC 0.9980 (445.1389) 0.0064 0.9814 1.0140
RSI 0.9997 (535.5833) 0.0065 0.9861 1.0171
SAR 0.9945 (499.7222) 0.0053 0.9790 1.0005
Slow Stochastic 1.0007 (508.5278) 0.0048 0.9927 1.0173
Williams %R 1.0020 (536) 0.0034 0.9957 1.0133
Table 4: Group summary statistics of the terminal wealths of experts grouped by their underlying strategy for intraday-daily trading. In brackets are the mean overall rankings of experts utilising each strategy.

The figure inset of figure 6(b) illustrates the probability of loss for each of the first 25 periods of the 400 periods as discussed in the above paragraph. It takes roughly an hour (13 periods) for the probability of loss to converge to zero.

(a) The overall cumulative portfolio wealth (S) for intraday-daily data with no transaction costs. The figure inset illustrates the associated profits and losses.
(b) Histogram of the 5000 simulated Min-t statistics resulting from the CM model and the incremental process given in equation 17 for the first 400 trading periods for intraday-daily profits and losses without taking into account transaction costs, along with the Min-t statistic for the overall strategy (green) and the critical value at the 5% significance level (red). The figure inset shows the probability of the overall trading strategy generating a loss after each successive period of the intraday-daily profit and loss process (PL), taken from the second day, when active trading commences.
Figure 6: The performance of the algorithm (figure 6(a)) and the results of the statistical arbitrage test (figure 6(b)) on intraday-daily data without any account for transaction costs.

Transaction Costs

We now report the results of the algorithm run on the same intraday-daily data as in the subsection above, this time with transaction costs incorporated (see section 2.5). figure 7(a) and its inset illustrate the overall cumulative portfolio wealth (S) and profits and losses (PL) respectively for intraday-daily trading with transaction costs. For comparison, the axes are set to match those used in the case of no transaction costs (figure 6(a) and its inset). Surprisingly, even with a total daily trading cost (direct and indirect) of roughly 130bps, which is a fairly conservative approach, the algorithm is able to make satisfactory returns, in contrast to the daily trading case (figure 5(a)). Furthermore, figure 7(b) provides significant evidence to reject the no statistical arbitrage null hypothesis and returns a Min-t statistic close to that of the case of no transaction costs (4.32 in the transaction costs case compared to 3.87; figure 6(b)). Even more comforting is the fact that, even when transaction costs are considered, the probability of loss per trading period converges to zero, albeit more slowly (31 trading periods, or roughly two and a half hours) than in the case of no transaction costs (13 trading periods, or roughly 1 hour, as illustrated in the inset of figure 6(b)).

The above results for intraday-daily trading are in complete contrast to the case of daily trading with transaction costs, where the no statistical arbitrage null could not be rejected, the probability of loss did not converge to zero and remain there, and trading profits steadily declined over the trading horizon. This suggests that the proposed algorithm may be much better suited to trading at higher frequencies. This is not surprising and is in agreement with Schulmeister (2009), who argues that the profitability of technical trading strategies on daily data declined from 1960 before becoming unprofitable from the 1990s. Schulmeister (2009) then implements a substantial set of technical trading strategies on 30-minute data, and the evidence suggests that such strategies returned adequate profits between 1983 and 2007, although the profits declined slightly between 2000 and 2007 compared to the 1980s and 1990s. This suggests that markets may have become more efficient, and raises the possibility that stock price and volume trends have shifted to even higher frequencies than 30 minutes (Schulmeister (2009)). This supports the choice to trade the algorithm proposed in this paper on at least 5-minute OHLCV data and reinforces our conclusion that, ultimately, the most desirable implementation of the algorithm would be in volume-time, which is best suited for high frequency trading.

(a) The overall cumulative portfolio wealth (S) for intraday-daily data with transaction costs. The figure inset illustrates the associated profits and losses.
(b) Histogram of the 5000 simulated Min-t statistics resulting from the CM model and the incremental process given in equation 17 for the first 400 trading periods for intraday-daily profits and losses less transaction costs, along with the Min-t statistic for the overall strategy (green) and the critical value at the 5% significance level (red).
Figure 7: The performance of the algorithm (figure 7(a)) and the results of the statistical arbitrage test (figure 7(b)) on intraday-daily data with transaction costs incorporated.

6 Conclusion

We have developed a learning algorithm built from a base of technical trading strategies for the purpose of trading equities on the JSE that is able to provide favourable returns when ignoring transaction costs, under both daily and intraday trading conditions. The returns are reduced when transaction costs are considered in the daily setting; however, there is sufficient evidence to suggest that the proposed algorithm is well suited to intraday trading.

This is reinforced by the fact that there is meaningful evidence to reject a carefully defined null hypothesis of no statistical arbitrage in the overall trading strategy, even when a reasonably aggressive view is taken on intraday trading costs. We are also able to show, in both the daily and intraday-daily data implementations, that the probability of loss declines below 5% relatively quickly, which strongly suggests that the algorithm is well suited for a trader whose preference or requirement is to make adequate returns in the short run. It may well be that the statistical arbitrages we have identified intraday are artefacts from “price distorters” (Moffit (2017)) rather than legitimate mispricing in the sense of majority views relative to a trading minority, and hence cannot be easily traded out of profit. This suggests that it can be important to try to unpack the difference between structural mispricings and statistical arbitrages; that is outside of the scope of the current work and cannot be determined using the tests implemented in this work.

The superior performance of the algorithm for intraday trading is in agreement with Schulmeister (2009), who concluded that while the daily profitability of a large set of technical trading strategies has steadily declined since 1960 and has been unprofitable since the onset of the 1990s, trading the same strategies on 30-minute (intraday) data between 1983 and 2007 has produced decent average gross returns. However, such returns have slowly declined since the early 2000s. In conclusion, the proposed algorithm is much better suited to trading at higher frequencies, but we are also aware that over time trading strategies that are not structural in nature are slowly arbitraged away through over-crowding.

We are also cognisant of the fact that intraday trading will require a large component of accumulated trading profits to finance frictions, concretely to fund direct, indirect and business model costs (Loonat and Gebbie (2018)). For this reason, we remain sceptical about the long-run performance of this class of algorithms when trading with real money in a live trading environment for profit. The current design of the algorithm is not yet ready to be traded on live market data; however, with some effort it can be transferred to such use cases, given the sequential nature of the algorithm and its inherent ability to receive and adapt to new incoming data while making appropriate trading decisions based on the new data. Concretely, the algorithm should be deployed in the context of volume-time trading rather than the calendar-time context considered in this work.

Possible future work includes implementing the algorithm in volume-time, which will be best suited for dealing with a high frequency implementation of the proposed algorithm, given the intermittent nature of order-flow. We also propose replacing the learning algorithm with an online (adaptive) neural network that has the ability to predict optimal holding times of stocks. Another interesting line of work that has been considered is to model the population of trading experts as competing in a predator-prey environment (Farmer (2000); Johnson et al. (2013)). This was an initial key motivation for the research project, to find which collections of technical trading strategies can be grouped collectively and how these would interact with each other. This can include using cluster analysis to group, or separate trading experts, based on their similarities and dissimilarities, and hence make appropriate inferences regarding their interactions and behaviours at the level of collective and emergent dynamics. This can in turn be used for cluster based approaches for portfolio control.


NM and TG would like to thank the Statistical Finance Research Group in the Department of Statistical Sciences for various useful discussions relating to the work. In particular, we would like to thank Etienne Pienaar, Lionel Yelibi and Duncan Saffy. We thank Michael Gant for his help with developing some of the strategies and for numerous valuable discussions with regards to the statistical arbitrage test.


TG would like to thank UCT FRC for funding (UCT fund 459282).

Supplemental material

Please access the supplemental material at Murphy (2019).


Appendix A Technical Indicators and Trading Rules

We follow Creamer and Freund (2010); Kestner (2003) in introducing and describing some of the more popular technical analysis indicators as well as a few others that are widely available. We also provide some trading rules which use technical indicators to generate buy, sell and hold signals.

Indicator Description Calculation
The Simple Moving Average (SMA) is the mean of the closing prices over the last n trading days. The smaller the value of n, the closer the moving average will fit to the price data.
The Exponential Moving Average (EMA) uses today’s close price, yesterday’s moving average value and a smoothing factor (). The smoothing factor determines how quickly the exponential moving average responds to current market prices (Kestner (2003)).
HH(n) The Highest High (HH) is the greatest high price in the last n periods and is determined from the vector of the high prices of the last n periods. Given the high prices of the last n periods:
to find:
LL(n) The Lowest Low (LL) is the smallest low price in the last n periods and is found from the vector of low prices in the last n periods. Given the low prices of the last n periods:
to find:
The Ichimoku Kinko Hyo (IKH) (at a glance equilibrium chart) system consists of five lines and the Kumo (cloud).31 The five lines all work in concert to produce the end result. The size of the Kumo is an indication of the current market volatility, where a wider Kumo is a more volatile market. Typical input parameters: , , and . Here we keep fixed at 7 but vary the other two parameters. Tenkan-sen (Conversion Line):
Kijun-sen (Base Line):
Chikou Span (Lagging Span):
Senkou Span A (Leading Span A):
Senkou Span B (Leading Span B):
Kumo (Cloud): Area between the Leading Span A and the Leading Span B from the Cloud
Momentum (MOM) gives the change in the closing price over the past n periods.
Acceleration (ACC) measures the change in momentum between two consecutive periods.
The Moving Average Convergence/Divergence (MACD) oscillator attempts to determine whether traders are accumulating stocks or distributing stocks. It is calculated by computing the difference between a short-term and a long-term moving average. A signal line is computed by taking an EMA of the MACD and determines the instances to buy (over-sold) and sell (over-bought) when used in conjunction with the MACD32. Long-term EWM:
Short-term EWM:
Moving Average Convergence/Divergence:
The “Signal Line” (SL):
The “MACD Signal” (MACDS):
The Fast Stochastic Oscillator shows the location of the closing price relative to the high-low range, expressed as a percentage, over a given number of periods as specified by a look-back parameter.
The Slow Stochastic Oscillator is very similar to the fast stochastic indicator and is in fact just a moving average of the fast stochastic indicator.
Relative Strength Index (RSI) compares the periods that stock prices finish up (closing price higher than the previous period) against those periods that stock prices finish down (closing price lower than the previous period).33
where finishing up (down) are:
to find the vector of up and down finishing cases:
Moving Average Relative Strength Index (MARSI) is an indicator that smooths out the action of the RSI indicator.34 MARSI is calculated by simply taking an n-period SMA of the RSI indicator.
Bollinger (BOLL) bands use an SMA as their reference point (known as the median band). The upper and lower Bollinger bands are calculated as functions of standard deviations about the median band. Median band:
Upper band:
Lower band:
Here the standard-deviation multiplier is chosen to be 2.
Price Rate of Change (PROC) is the rate of change of the time series of closing prices over the last n periods, expressed as a percentage.
Williams Percent Range (Williams %R) is calculated similarly to the fast stochastic oscillator and shows the level of the close relative to the highest high in the last n periods.
Parabolic Stop and Reverse (SAR), developed by J. Wells Wilder, is a trend indicator formed by a parabolic line made up of dots at each time step (Wilder (1978)). The dots are formed using the most recent Extreme Price and an acceleration factor (AF), 0.02, which increases each time a new Extreme Price (EP) is reached. The AF has a maximum value of 0.2 to prevent it from getting too large. Extreme Price represents the highest (lowest) value reached by the price in the current up-trend (down-trend). The acceleration factor determines where in relation to the price the parabolic line will appear by increasing by the value of the AF each time a new EP is observed and thus affects the rate of change of the Parabolic SAR. Calculating the SAR indicator:
  1. Initialise: Set the initial trend to 1 (up-trend), EP to zero, AF to 0.02, the SAR value to the closing price at time zero, Last High (LH) to the high price at time zero and Last Low (LL) to the low price at time zero

  2. Level Update: update EP, LH, LL and AF based on where the current high is in relation to the LH (up-trend), or where the current low is in relation to the LL (down-trend)

  3. Time Update: update the SAR value for the current time using the Parabolic SAR update equation, calculated from the previous period’s SAR value

  4. Trend Update: modify the SAR value, AF, EP, LL, LH and the trend based on the trend and its value in relation to the current low and current high

  5. Iterate: go to next time period and return to step 2

Table 5: The set of trading indicators utilised by the trading rules described in Table 6 along with their descriptions and calculation details.
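Several of the simpler indicators in Table 5 can be sketched in a few lines of Python with pandas. These are the common textbook forms and may differ in detail from the calculations used in this paper: the EMA smoothing factor of 2 / (n + 1), the use of the sample standard deviation in the Bollinger bands, and the negative sign convention for Williams %R are all assumptions, as are the function names.

```python
import pandas as pd


def sma(close: pd.Series, n: int) -> pd.Series:
    return close.rolling(n).mean()


def ema(close: pd.Series, n: int) -> pd.Series:
    # Assumed smoothing factor alpha = 2 / (n + 1); recursion seeded with the first close.
    return close.ewm(alpha=2.0 / (n + 1), adjust=False).mean()


def highest_high(high: pd.Series, n: int) -> pd.Series:
    return high.rolling(n).max()


def lowest_low(low: pd.Series, n: int) -> pd.Series:
    return low.rolling(n).min()


def momentum(close: pd.Series, n: int) -> pd.Series:
    return close.diff(n)  # change in close over the past n periods


def acceleration(close: pd.Series, n: int) -> pd.Series:
    return momentum(close, n).diff()  # change in momentum between consecutive periods


def proc(close: pd.Series, n: int) -> pd.Series:
    previous = close.shift(n)
    return 100.0 * (close - previous) / previous


def fast_stochastic_k(high, low, close, n):
    hh, ll = highest_high(high, n), lowest_low(low, n)
    return 100.0 * (close - ll) / (hh - ll)


def williams_r(high, low, close, n):
    hh, ll = highest_high(high, n), lowest_low(low, n)
    return -100.0 * (hh - close) / (hh - ll)  # assumed sign convention: range [-100, 0]


def bollinger(close: pd.Series, n: int, d: float = 2.0):
    mid = sma(close, n)
    sd = close.rolling(n).std()  # sample standard deviation (assumption)
    return mid - d * sd, mid, mid + d * sd
```

Each function returns a series aligned with its input, with missing values for the initial periods where the look-back window is incomplete.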
Trading Rule Decision Condition
Moving Average Crossover