Learning the dynamics of technical trading strategies
Abstract
We use an adversarial expert based online learning algorithm to learn the optimal parameters required to maximise wealth trading zero-cost portfolio strategies. The learning algorithm is used to determine the relative population dynamics of technical trading strategies that can survive historical backtesting, as well as to form an overall aggregated portfolio trading strategy from the set of underlying trading strategies implemented on daily and intraday Johannesburg Stock Exchange data. The resulting population time-series are investigated using unsupervised learning for dimensionality reduction and visualisation. A key contribution is that the overall aggregated trading strategies are tested for statistical arbitrage using a novel hypothesis test proposed by Jarrow et al. (2012) on both daily sampled and intraday time-scales. The (low frequency) daily sampled strategies fail the arbitrage tests after costs, while the (high frequency) intraday sampled strategies are not falsified as statistical arbitrages after costs. The estimates of trading strategy success, cost of trading and slippage are considered along with an offline benchmark portfolio algorithm for performance comparison. In addition, the algorithm's generalisation error is analysed by recovering a probability of backtest overfitting estimate using a nonparametric procedure introduced by Bailey et al. (2016). The work aims to explore and better understand the interplay between different technical trading strategies from a data-informed perspective.
Online learning; technical trading; portfolio selection; statistical arbitrage; backtest overfitting; Johannesburg Stock Exchange
G11, G14 and O55
1 Introduction
Maximising wealth concurrently over multiple time periods is a difficult task; particularly when combined with capital allocations that are made simultaneously with the selection of plausible candidate trading strategies and signals. An approach to combining strategy selection with wealth maximisation is to use online or sequential machine learning algorithms (Györfi et al. (2012)). Online portfolio selection algorithms attempt to automate a sequence of trading decisions among a set of stocks with the goal of maximising returns in the long run. Here the long run can correspond to months or even years, and is dependent on the frequency at which trading takes place.
The attraction of this approach is that the investor does not need to have any knowledge about the underlying distributions that could be generating the stock prices (or even if they exist). The investor is left to “learn” the optimal portfolio to achieve maximum wealth using past data directly (Györfi et al. (2012)).
Cover (1991) introduced a “follow-the-winner” online investment algorithm called the Universal Portfolio (UP) algorithm.
Here our “experts” will be similarly characterised by a portfolio, but by portfolios that proxy different trading strategies. A particular expert makes decisions independently of all other experts. The UP algorithm holds parametrised Constant Rebalanced Portfolio (CRP) strategies as its underlying experts; we take a more general approach to generating experts. The algorithm provides a method to effectively distribute wealth among all the CRP experts such that the average log-performance of the strategy approaches that of the Best Constant Rebalanced Portfolio (BCRP), the hindsight strategy which gives the maximum return of all such strategies in the long run. A key innovation was the provision of a mathematical proof for this claim based on arbitrary sequences of ergodic and stationary stock return vectors (Cover (1991)).
If some log-optimal portfolio exists such that no other investment strategy has a greater asymptotic average growth, then achieving such optimality requires full knowledge of the underlying distribution and of the generating process (Algoet and Cover (1988); Cover (1991); Cover and Ordentlich (1996); Györfi et al. (2008)). Such knowledge is unlikely in the context of financial markets. However, strategies which achieve an average growth rate which asymptotically approximates that of the log-optimal strategy are possible when the underlying asset return process is sufficiently close to being stationary and ergodic. Such a strategy is called universally consistent.
Györfi et al. (2008) proposed a universally consistent portfolio strategy and provided empirical evidence of a strategy based on nearest-neighbour based experts which reflects such asymptotic log-optimality. The idea is to match current price dynamics with similar historical dynamics using pattern matching. The pattern matching is implemented using a nearest-neighbour search algorithm to select parameters for experts. The pattern-matching algorithm was extended by Loonat and Gebbie (2018) in order to implement a zero-cost portfolio, i.e. a long/short and self-financing portfolio. The algorithm was also recast to replicate near-real-time applications using look-up libraries learnt offline. However, there is a computational cost associated with coupling and creating offline pattern libraries, and the algorithms are not truly online.
A key objective in the implementation of online learning here is that the underlying experts remain online and that they can be sequentially computed on a moving finite data-window using parameters from the previous time-step. Here we have ignored the pattern-matching step in the aforementioned algorithms and rather propose our own expert generating algorithm using tools from technical analysis. Concretely, we replace the pattern-matching expert generating algorithm with a selection of technical trading strategies.
Technical indicators are popular tools from technical analysis used to generate trading strategies (Chan (2009); Clayburg (2002)). They claim to be able to exploit statistically measurable short-term market opportunities in stock prices and volume by studying recurring patterns in historical market data (Creamer and Freund (2010); Chande and Kroll (1994); Rechenthin (2014)). What differentiates technical analysis from traditional time-series analysis is that it tends to place an emphasis on recurring time-series patterns rather than invariant statistical properties of time-series. Traditionally, technical analysis has been a visual activity, whereby traders study the patterns and trends in charts, based on price or volume data, and use these diagnostic tools in conjunction with a variety of qualitative market features and news flow to make trading decisions.
This is perhaps not dissimilar to the relationship between alchemy and chemistry, or astrology and astronomy, but in the setting of financial markets. Many studies have criticised the lack of a solid mathematical foundation for many of the proposed technical analysis indicators (Aronson (2007); Lo et al. (2000); Lo and Hasanhodzic (2009)). There has also been an abundance of academic literature utilising technical analysis for the purpose of trading, and several studies have attempted to develop indicators and test them in a more mathematically, statistically and numerically sound manner (Aronson (2007); Kestner (2003); Park and Irwin (2004)). Much of this work is still viewed with some suspicion, given that it is extremely unlikely that this or that particular strategy or approach was not the result of some sort of backtest overfitting (Hou et al. (2017); Bailey et al. (2014, 2016)).
Our work does not address the question: “Which, if any, technical analysis methods reveal useful information for trading purposes?”; rather, we aim to bag a collection of technical experts and allow them to compete in an adversarial manner using the online learning algorithm. This allows us to consider whether the resulting aggregate strategy: 1.) can pass reasonable tests for statistical arbitrage, and 2.) has a relatively low probability of being the result of backtest overfitting. Can the strategy be considered a statistical arbitrage, and can it generalise well out-of-sample?
Concretely, here we are concerned with understanding whether the collective population of technical experts can, through time, lead to dynamics that can reasonably be considered a statistical arbitrage (Jarrow et al. (2012)), and then with a reasonably low probability of backtest overfitting (Bailey et al. (2016)). Can we generate wealth, both before costs, and then after costs, using the online aggregation of technical strategies? Then, what broad groups of strategies will emerge as being successful in the sense of positive trading profits with declining variance in losses?
Incorrectly accounting for costs will always be a plausible explanation for any apparently profitable trading strategy (see Loonat and Gebbie (2018)), but even after costs there still exists a high likelihood that there was some data-overfitting, because we only have a single price path from history with little or no knowledge about the true probability of the particular path that has been measured. Moreover, the adaptive nature of markets themselves continually changes the efficacy of various strategies and approaches.
Rather than considering various debates relating to the technicalities of market efficiency, where one is concerned with the expeditiousness of market prices to incorporate new information at any time, and where information is explicitly exogenous, we restrict ourselves to market efficiency in the sense used by Fischer Black (Black (1986); Bouchaud et al. (2017); Aronson (2007)). This is the situation where some of the short-term information is in fact noise, and where this type of noise is a fundamental property of real markets. Although market efficiency may plausibly hold over the longer term, in the short-term there may be small departures that are amenable to tests for statistical arbitrage (Jarrow et al. (2012)), departures that create incentives to trade, and more importantly, departures that may not be easily traded out of the market due to various asymmetries in costs, market structure and market access.
In order to analyse whether the overall backtested strategy depicts a candidate statistical arbitrage, we implement a test first proposed by Hogan et al. (2004) and further refined by Jarrow et al. (2012). Hogan et al. (2004) provide a plausible technical definition of statistical arbitrage based on a vanishing probability of loss and variance in the trading profits, and then use this to propose a test for statistical arbitrage using a Bonferroni test. This methodology was extended and generalised by Jarrow et al. (2012) to account for the asymmetry between desirable positive deviations (profits) and undesirable negative deviations (losses), by including a semi-variance hypothesis instead of the originally constructed variance hypothesis, which does not condition on negative incremental deviations. The so-called Min-t statistic is computed, and used in conjunction with a Monte Carlo procedure, to make inferences regarding a carefully defined “no statistical arbitrage” null hypothesis.
This is analogous to evaluating market efficiency in the sense of the Noisy efficient market hypothesis (Black (1986)) whereby a failure to reject the no statistical arbitrage null hypothesis will result in concluding that the market is in fact sufficiently efficient and no persistent anomalies can be consistently exploited by trading strategies over the long term. Traders will always be inclined to employ strategies which depict a statistical arbitrage and especially strategies which have a probability of loss that declines to zero quickly as such traders will often have limited capital and short horizons over which they must provide satisfactory returns (profits) (Jarrow et al. (2012)).
We make the effort here to be very clear that we do not attempt to identify profitable (technical) trading strategies nor to make any claims about the informational value of technical analysis, but rather we will generate a large population of strategies, or experts, constructed from various technical trading rules, and combinations of the associated parameters of these rules, in the attempt to learn something about the aggregate profitability of the population dynamics of the set of experts.
Experts will generate trading signals, i.e. buy, sell or hold decisions for each stock held in their portfolio, based on the underlying parameters and the necessary historic data implied by the parameters. Once trading signals for the current time period have been generated by a given expert, a methodology to transform the signals into a set of portfolio weights, or controls, is required.
We introduce a transformation method that computes controls proportional to the relative volatilities of the stocks for which non-zero trading signals were generated, and then normalise the resulting values such that the self-financing and leverage constraints required by the algorithm are satisfied. The resulting controls are then utilised to compute the corresponding expert wealths. The experts who accumulate the greatest wealth during a trading period will receive more wealth in the following trading period and thus contribute more to the final aggregated portfolio. This can be best thought of as some sort of “fund-of-funds” over the underlying collection of trading strategies.
This is a meta-expert that aggregates experts that represent all the individual technical trading rules. The overall meta-expert strategy performance is achieved by the online learning algorithm. We explicitly provide equity curves for the individual experts' portfolios, given as the accumulated trading profit through time, along with performance curves for the overall strategy's wealth and the associated profits and losses.
We perform a backtest of the algorithm on two different data sets over two separate time periods: 1.) one using daily data over a six-year period, and 2.) the other using a mixture of intraday and daily data over a two-month period. A selection of the fifteen most liquid stocks which constitute the Johannesburg Stock Exchange (JSE) Top 40 shares is utilised for the two separate implementations.
The overall strategy performance is compared to that of the BCRP strategy, which serves as a benchmark to evaluate the success of our strategy. The overall strategy is then tested for statistical arbitrage, and we find that in both the daily and the intraday-daily data implementations the strategy depicts a statistical arbitrage before costs.
A key point here, as in Loonat and Gebbie (2018), is that it does seem that plausible statistical arbitrages are detected on the meso-scale and in the short-term. However, after accounting for reasonable costs, only the short-term trading strategies seem to pass statistical arbitrage tests. This does not imply profitability, as these remaining strategies may be structural and hence may not be easily profitably traded out of the system.
Finally, we analyse the generalisation error of the overall strategy to get a sense of whether or not the strategy conveys backtest overfitting, by estimating the probability of backtest overfitting (PBO) inherent in multiple simulations of the algorithm on subsets of historic data.
The paper will proceed as follows: section 2 explains the construction of the algorithm, including details of how the experts are generated, how their corresponding trading signals are transformed into portfolio weights, and a step-by-step breakdown of the learning algorithm. In section 3, we introduce the concept of a statistical arbitrage, including the methodology for implementing a statistical arbitrage test, calculating the probability of loss and estimating the PBO for a trading strategy. All experiment results and analyses of implementations of the algorithm are presented in section 5. Section 6 states all final conclusions from the experiments and possible future work.
In summary, we are able to show that on a daily sampled time-scale there is most likely little value in the aggregate trading of the technical strategies of the type considered here. However, on intraday time-scales things look slightly different. Even with reasonable costs accounted for, there still seems to be the possibility that price based technical trading cannot be ruled out as offering an avenue for statistical arbitrage. However, considerable care is still required to ensure that one is accounting for the full complexity of market realities, which can often make it practically difficult to arbitrage these sorts of apparent trading opportunities out of the market: they may be the result of top-down structure and order-flow itself, and hence the signature of noise trading, rather than of some notion of information inefficiency.
2 Learning Technical Trading
Rather than using a backtest in the attempt to find the single most profitable strategy, we produce a large population of trading strategies (“experts”) and use an adaptive algorithm to aggregate the performances of the experts to arrive at a final portfolio to be traded. The idea of the online learning algorithm is to consider a population of experts created using a large set of technical trading strategies generated from a variety of parameters, and to form an aggregated portfolio of stocks to be traded by considering the wealth performance of the expert population.
During each trading period, experts trade and execute either buy (1), sell (-1) or hold (0) actions. These actions are independent of one another and based on each of the individual experts' strategies. The trade signals are transformed into a set of portfolio weights such that their sum is identically zero; this ensures that the strategy can be self-funding. We also require that the portfolio is unit leveraged, and hence that the absolute sum of controls is equal to one. This is to avoid having to introduce a margin account into the trading mechanics.
Based on each individual expert’s accumulated wealth up until some time $t$, a final aggregate portfolio for the next period $t+1$ is formed by creating a performance weighted combination of the experts. Experts who perform better in period $t$ will have a larger relative contribution toward the aggregated portfolio to be implemented in period $t+1$ than those who perform poorly. Below, we describe the methodology for generating the expert population.
2.1 Expert Generating Algorithm
Technical trading
Technical trading refers to the practice of using trading rules derived from technical analysis indicators to generate trading signals. Here, indicators refer to mathematical formulas based on Open-High-Low-Close (OHLC) price bars, volume traded or a combination of both (OHLCV). An abundance of technical indicators and associated trading rules have been developed over the years with mixed success. Indicators perform differently under different market conditions and with different human operators, which is why traders will often use multiple indicators, confirming the signal that one indicator gives on a stock with another indicator's signal. Thus, in practice and in various studies in the literature, many trading rules generated from indicators are typically backtested on a significant amount of historical data (typically thousands of data points) to find the rules that perform the best.
In order to produce the broad population of experts, we consider combinations among a set of four model parameters. The first of these parameters is the underlying strategy of a given expert, , which corresponds to the set of technical trading and trend-following strategies where the total number of different trading rules is denoted by . Each of the rules requires at most two parameters each time a buy, sell or hold signal is computed at some time period . The two parameters represent the number of short-term and long-term look-back periods necessary for the indicators used in the rules. These parameters will determine the amount of historic data considered in the computation of each rule. We will denote the vector of short-term parameters by and the long-term parameters by which make up two of the four model parameters. Let and
The final model parameter, denoted by c, refers to object clusters where is the object cluster and is the number of object clusters. We will consider four object clusters (); the trivial cluster which contains all the stocks and the three major sector clusters of stocks on the JSE, namely, Resources, Industrials and Financials.
The algorithm will loop over all combinations of these four model parameters calling the appropriate strategies, , stocks, c, and amount of historic data, and , to create a buy, sell or hold signal at each time period . Each combination of for , for , for and for will represent an expert. It should be clear that some experts may trade all the stocks, i.e. use the trivial cluster, and others will trade subsets of the stocks, i.e. Resources, Industrials and Financials. It is also important to note that for rules requiring two parameters, the loop over the long-term parameters will only activate at indices for which where and represent the loop index over the short-term and long-term parameters respectively. The total number of experts, , is then given by
We will denote each expert’s strategy
(1) 
where is the volume of the stock at period . The stocks with the largest ADV will then be fed into the algorithm for trading.
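As a rough illustration of the expert enumeration described above, the sketch below loops over rules, object clusters and look-back periods. The rule names, the look-back values, the assumption that one-parameter rules use only a long-term look-back, and the condition that the short-term look-back must be strictly smaller than the long-term look-back are all hypothetical choices made for this example; they are not taken from the paper.

```python
from itertools import product


def enumerate_experts(one_param_rules, two_param_rules, clusters,
                      short_lookbacks, long_lookbacks):
    """Return one (rule, cluster, n_short, n_long) tuple per expert."""
    experts = []
    for cluster in clusters:
        # Assumption: one-parameter rules only need a long-term look-back.
        for rule, n_long in product(one_param_rules, long_lookbacks):
            experts.append((rule, cluster, None, n_long))
        # Assumption: two-parameter rules keep only pairs with short < long.
        for rule, n_short, n_long in product(
                two_param_rules, short_lookbacks, long_lookbacks):
            if n_short < n_long:
                experts.append((rule, cluster, n_short, n_long))
    return experts


# Hypothetical inputs: three placeholder rules and four object clusters.
experts = enumerate_experts(
    one_param_rules=["rule_a"],
    two_param_rules=["rule_b", "rule_c"],
    clusters=["all", "resources", "industrials", "financials"],
    short_lookbacks=[5, 10],
    long_lookbacks=[10, 20],
)
print(len(experts))
```

Each tuple stands for one expert; under these assumptions the expert count grows multiplicatively with the number of clusters and with the number of admissible look-back pairs.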
Transforming signals into weights
In this section, we describe how each individual expert’s set of trading signals at each time period are transformed into a corresponding set of portfolio weights (controls) which constitute the expert’s strategy (). For the purpose of generality, we refer to the stocks traded by a given expert as even though the weights of many of these stocks will be zero for multiple periods as the expert will only be considering a subset of these stocks depending on which object cluster the expert trades.
Suppose it is currently time period and the expert is trading stocks. Given that there are stocks in the portfolio, trading signals will need to be produced at each trading period. The risk-free asset's purpose will be solely to balance the portfolio given the set of trading signals. Given the signals for the current time period and previous period , all hold signals at time are replaced with the corresponding non-zero signals from time as the expert retains their position in these stocks.

Case I: All output signals are hold (0)

Case II: All output signals are non-negative (0 or 1)

Case III: All output signals are non-positive (0 or -1)

Case IV: There are combinations of buy, sell and hold signals (1, -1 and 0) in the set of output signals
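The signal handling above can be sketched in a few lines: hold signals inherit the previous period's signal, and the resulting signal vector is then assigned to one of the four cases. The function names `carry_forward` and `classify` are hypothetical and used only for illustration.

```python
def carry_forward(current, previous):
    """Replace hold (0) signals with the previous period's signal."""
    return [prev if cur == 0 else cur for cur, prev in zip(current, previous)]


def classify(signals):
    """Label a signal vector as case I, II, III or IV."""
    if all(s == 0 for s in signals):
        return "I"    # all hold
    if all(s >= 0 for s in signals):
        return "II"   # long-only
    if all(s <= 0 for s in signals):
        return "III"  # short-only
    return "IV"       # mixture of buy, sell and hold


# The first stock keeps its earlier long position; the third stays on hold.
signals = carry_forward(current=[0, 1, 0, -1], previous=[1, 0, 0, -1])
case = classify(signals)
print(signals, case)
```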
Due to the fact that cases II (long-only) and III (short-only) exist, we need to include a risk-free asset in the portfolio so that we can enforce the self-financing constraint: the controls must sum to zero. We refer to such portfolios as zero-cost portfolios. Additionally, we implement a leverage constraint by ensuring that the absolute values of the controls sum to unity.
For case I, we set all stock weights and the risk-free asset weight to zero so that the expert does not allocate any capital in this case since the output signals are all zero.
For the remaining cases we need to find asset weights that satisfy the investment constraints. From the asset standard deviations we can define the buy (and sell) weight allocations to stocks with positive (negative) signals, which gives the long-asset (short-asset) weights:
(2) 
We will use these weights based on the signals to generate trading positions.
For case II, we compute the standard deviations of the set of stocks which resulted in buy (positive) signals from the output signals using their closing prices over the last 90 days for daily trading and the last 90 trading periods for intraday-daily trading.
The number of buy signals, from the set of output signals, can be denoted by and the vector of standard deviations of stocks with non-zero output signals is given by . Then the weight allocated to stocks with positive signals is given by the positive-signal form of equation (2). Here the lowest value of corresponds to the least volatile stock and vice versa for large . This equation ensures that the long weights sum to one half. We then short the risk-free asset with a weight of one half. This allows us to borrow using the risk-free asset and purchase the corresponding stocks in which we take a long position.
Case III is similar to Case II above; however, instead of having positive output signals, all output signals are negative. Again, we compute standard deviations of the set of stocks which resulted in sell (negative) signals from the output signals using their closing prices over the last 120 trading periods. Let the number of sell signals from the set of output signals be denoted by and denote the vector of standard deviations of stocks to be sold by . Then the weight allocated to stocks which have short positions is given by the negative-signal form of equation (2). We then take a long position in the risk-free asset with a weight of one half.
For case IV, we use a similar methodology to that discussed above in Case II and III. To compute the weights for the short assets we use the negative-signal formula in equation (2); similarly, for the long assets we use the positive-signal formula from equation (2). We then set the risk-free asset weight to be equal to in order to enforce the self-financing and fully invested constraints. Finally, assets which had hold signals have their weights set to zero.
The method described above is what we will refer to as the volatility loading method for transforming signals into controls. A second method, called the inverse volatility loading method, is defined similarly to the method described above; however, instead of multiplying through by the volatility vector in each of the above cases, we multiply through by the inverse of the volatility vector (element-wise inverses). We will not implement the inverse volatility loading method in this study as the results of the two methods are similar.
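To make the four cases concrete, the following is a minimal sketch of a volatility loading transformation. It assumes, as an illustration rather than as the paper's exact equation (2), that each long stock receives a weight of one half times its share of the total volatility of the long stocks, that each short stock receives the negative analogue, and that the risk-free asset absorbs whatever is needed for the controls to sum to zero. The function name and the example volatilities are hypothetical, and volatilities are assumed strictly positive.

```python
def volatility_loading(signals, vols):
    """Map signals in {-1, 0, 1} to (stock_weights, risk_free_weight).

    Assumed form: w_i = +0.5 * sigma_i / sum(sigma over longs) for buys,
                  w_i = -0.5 * sigma_i / sum(sigma over shorts) for sells.
    """
    long_total = sum(v for s, v in zip(signals, vols) if s > 0)
    short_total = sum(v for s, v in zip(signals, vols) if s < 0)
    weights = []
    for s, v in zip(signals, vols):
        if s > 0:
            weights.append(0.5 * v / long_total)
        elif s < 0:
            weights.append(-0.5 * v / short_total)
        else:
            weights.append(0.0)  # hold signals carry no weight
    # The risk-free asset balances the portfolio (self-financing constraint).
    risk_free = -sum(weights)
    return weights, risk_free


# Case IV example: two buys, one sell, one hold.
w_mixed, rf_mixed = volatility_loading([1, 1, -1, 0], [0.1, 0.3, 0.2, 0.4])
# Case II example: long-only signals, so the risk-free asset is shorted.
w_long, rf_long = volatility_loading([1, 0], [0.2, 0.3])
print(w_mixed, rf_mixed)
print(w_long, rf_long)
```

Under these assumptions the controls sum to zero and their absolute values sum to one in cases II to IV, while case I returns all zeros. An inverse volatility loading variant would replace each volatility by its reciprocal before calling the same function.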
2.2 Online Learning Algorithm
Given that we now have a population of experts, each with their own controls, , we implement the online learning algorithm to aggregate the experts' strategies at time $t$ based on their performance and form a final single portfolio to be used in the following period, which we denote . The aggregation scheme used is inspired by the Universal Portfolio (UP) strategy of Cover (1991) and Cover and Ordentlich (1996), and a modified version proposed by Györfi et al. (2008). However, because we have several different base experts, as defined by the different trading strategies, rather than Cover’s (see Cover (1991)) constant rebalanced UP strategy, our algorithm is better described as a meta-learning algorithm (Li and Hoi (2016)). We use the subscript $t$ since the portfolio is created using information only available at time $t$ even though the portfolio is implemented in the following time period. The algorithm will run from the initial time
The relatively simplistic learning algorithm is incrementally implemented online but offline it can be parallelised across experts. Given the expert controls from the Expert Generating Algorithm (), the online learning algorithm is implemented by carrying out the steps (Loonat and Gebbie (2018)):

Update portfolio wealth: Given the portfolio control for the asset at time , we update the portfolio wealth for the period
(3) (4)
represents the compounded cumulative wealth of the overall aggregate portfolio and will denote the corresponding vector of aggregate portfolio wealths over time. Here the realised price relatives for the period and the asset, , are combined with the portfolio controls for the previous period to obtain the realised portfolio returns for the current period . is in fact the profits and losses for the current trading period . Thus, we will use it to update the algorithm's overall cumulative profits and losses, which is given by
(5) 
Update expert wealth: The expert controls were determined at the end of time-period for time period by the expert generating algorithm for experts and objects about which the experts make expert capital allocation decisions. At the end of the time period the performance of each expert , , can be computed from the change in the price relatives for each of the objects in the investment universe considered using the closing prices at the start, , and the end of the time increment, , using the expert controls.
(6) (7) 
Update expert mixtures: We consider a UP inspired expert mixture update rule as follows (Cover (1991); Györfi et al. (2008); Loonat and Gebbie (2018)): the mixture of the expert for the next time increment, , is equivalent to the accumulated expert wealth up until time and will be used as the update feature for the next unrealised increment after appropriate normalisation
(8) 
Renormalise expert mixtures: As mentioned previously, we will consider experts such that the leverage is set to unity for zero-cost portfolios: 1.) and 2.) . We will not consider the long-only experts (absolute experts as in Loonat and Gebbie (2018)), but only consider experts who satisfy the prior two conditions, which we will refer to as active experts. This in fact allows for shorting of one expert against another; then, due to the nature of the mixture controls, the resulting portfolio becomes self-funding.
(9) 
Update portfolio controls: The portfolio controls are updated at the end of time period for time period using the expert mixture controls from the updated learning algorithm and the vector of expert controls for each expert from the expert generating algorithms using information from time period . We then take a weighted average over all experts by taking the sum with respect to
(10)
The strategy is to implement the portfolio controls, wait until the end of the current time increment, measure the features (OHLCV values), update the experts and then reapply the learning algorithm to compute the expert mixtures and portfolio controls for the next time increment.
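The five steps above can be sketched as a single update function. This is only an illustration under stated assumptions: the period return of a zero-cost portfolio is taken to be the sum of controls times (price relative minus one), wealth compounds multiplicatively, the mixtures are the accumulated expert wealths centred on their mean and scaled to unit absolute sum, and the new portfolio controls are the mixture-weighted sum of expert controls. These forms, and all function and variable names, are assumptions of this example and are not quoted from equations (3) to (10).

```python
def portfolio_return(controls, price_relatives):
    """Assumed period return of a zero-cost portfolio: sum_i b_i * (x_i - 1)."""
    return sum(b * (x - 1.0) for b, x in zip(controls, price_relatives))


def learning_step(prev_portfolio, prev_experts, next_experts,
                  price_relatives, portfolio_wealth, expert_wealth):
    # 1. Update portfolio wealth with the realised return of last period's controls.
    pnl = portfolio_return(prev_portfolio, price_relatives)
    portfolio_wealth *= 1.0 + pnl

    # 2. Update each expert's wealth using that expert's previous controls.
    expert_wealth = [
        s * (1.0 + portfolio_return(h, price_relatives))
        for s, h in zip(expert_wealth, prev_experts)
    ]

    # 3. and 4. Mixtures: accumulated expert wealth, centred and rescaled so that
    # they sum to zero and their absolute values sum to one (assumed form).
    mean_wealth = sum(expert_wealth) / len(expert_wealth)
    centred = [s - mean_wealth for s in expert_wealth]
    scale = sum(abs(c) for c in centred)
    mixtures = [c / scale if scale > 0 else 0.0 for c in centred]

    # 5. New portfolio controls: mixture-weighted sum of the experts' new controls.
    n_assets = len(price_relatives)
    next_portfolio = [
        sum(q * h[i] for q, h in zip(mixtures, next_experts))
        for i in range(n_assets)
    ]
    return portfolio_wealth, expert_wealth, mixtures, next_portfolio, pnl


# Two hypothetical zero-cost experts trading two assets in opposite directions.
experts = [[0.5, -0.5], [-0.5, 0.5]]
wealth, expert_wealth, mixtures, controls, pnl = learning_step(
    prev_portfolio=[0.5, -0.5], prev_experts=experts, next_experts=experts,
    price_relatives=[1.1, 0.9], portfolio_wealth=1.0, expert_wealth=[1.0, 1.0],
)
print(wealth, expert_wealth, mixtures, controls)
```

In this toy example the first expert gains and the second loses, so the first receives a positive mixture and the second a negative one; the aggregate controls then tilt toward the first expert's positions while remaining zero-cost.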
2.3 Algorithm implementation for intraday-daily trading
Intraday trading poses a whole new set of issues that need to be considered. The first relates to how the actual trading occurs; the second to how to deal with spurious data. It is not as straightforward as substituting downsampled transaction data into algorithms built and tested on uniformly sampled daily data. Uniformly sampled daily closing auction data is not equivalent to uniformly sampled intraday bar-data. The end-of-day price discovery process is entirely different to that found intraday; the former is a closing auction, while the latter has prices that are the result of continuous-time trading in a double auction. In addition to this, the first and last datapoints of each day lead to overnight gap effects that will produce spurious signals if not aggregated correctly over multiple days. Rather than dealing with the full complexity of a money management strategy, the main issue that we are concerned with is the overnight gap effect, which relates to the deviation between the prices at the end of one day and the start of the next. We implement the learning algorithm on a combination of daily and intraday data, whereby decisions made on the daily time scale are made completely independently of those made on the intraday time scale, but the dynamics of the associated wealths generated by the two processes are aggregated. We will refer to trading using a combination of daily and intraday data as intraday-daily trading.
The best way to think about it is to consider the experts as trading throughout the day, making decisions based solely on intraday data while compounding their wealth, and once a trading decision is made at the final time bar, the expert makes one last trading decision on that day based on daily historic OHLCV data, where the lookback periods will be based on passed trading days and not on any time bars for that day. The daily trading decision can be thought of as representing the last time bar of the day, where we are just using different data to make the decision. The methodology for each of the intraday and daily trading mechanisms are almost exactly as explained in section 2.2 above, however, there are necessary alterations to the algorithm. As in the daily data implementation, a given expert will begin making trading decisions as soon as there is a sufficient amount of data available to them. Here, we begin the algorithm from day two so that there is sufficient data to compute at least one return on the daily time scale. We then loop over the intraday time bars from 9:15am to 4:30pm on each given day.
To introduce some notation for intraday-daily trading, let
At the beginning of day , the expert wealth
2.4 Online Portfolio Benchmark Algorithm
To get an idea of how well our online algorithm performs, we compare its performance to that of the offline BCRP. As mentioned previously, the BCRP is the CRP strategy, chosen in hindsight, which gives the maximum return of all such strategies in the long run. To find the portfolio controls of such a strategy, we take a brute-force Monte Carlo approach: we generate 5000 random CRP strategies on the entire history of price relatives and choose the BCRP strategy to be the one that returns the maximal terminal portfolio wealth. Note that the CRP strategies we consider here are long-only.
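The brute-force search described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: the function names, the uniform sampling of long-only weights on the simplex and the toy price relatives are all assumptions made for the example.

```python
import random


def random_simplex_weights(n_assets, rng):
    """Draw long-only weights uniformly from the unit simplex (assumed sampler)."""
    draws = [rng.expovariate(1.0) for _ in range(n_assets)]
    total = sum(draws)
    return [d / total for d in draws]


def crp_terminal_wealth(weights, price_relatives):
    """Terminal wealth of a CRP rebalanced to `weights` each period (initial wealth 1)."""
    wealth = 1.0
    for x_t in price_relatives:
        wealth *= sum(w * x for w, x in zip(weights, x_t))
    return wealth


def monte_carlo_bcrp(price_relatives, n_trials=5000, seed=0):
    """Return the best of `n_trials` random CRPs, judged in hindsight."""
    rng = random.Random(seed)
    n_assets = len(price_relatives[0])
    best_weights, best_wealth = None, float("-inf")
    for _ in range(n_trials):
        weights = random_simplex_weights(n_assets, rng)
        wealth = crp_terminal_wealth(weights, price_relatives)
        if wealth > best_wealth:
            best_weights, best_wealth = weights, wealth
    return best_weights, best_wealth


# Hypothetical price relatives (rows are periods, columns are stocks).
price_relatives = [[1.02, 0.99], [0.98, 1.01], [1.03, 1.00]]
weights, wealth = monte_carlo_bcrp(price_relatives)
```

Because the search is random, the selected portfolio only approximates the true BCRP; increasing `n_trials` tightens the approximation at a linear cost in run time.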
2.5 Transaction Costs and Market Frictions
Apart from the (direct) transaction fees (commissions) charged by exchanges for the trading of stocks, there are various other (indirect) costs that need to be considered when trading such assets. Each time a stock is bought or sold there are unavoidable costs, and it is imperative that a trader takes these costs into account. The other three most important components of these transaction costs, besides commissions charged by exchanges, are the spread
To estimate indirect transaction costs (TC) for each period, we use the square-root formula (Gatheral (2010))
where:

Volatility of the returns of a stock: See section 2.5.1 below.

Average daily volume of the stock (ADV): ADV is computed using the previous 90 days' trading volumes for daily trading and the previous 5 days' intraday trading volumes for intraday-daily trading.

Number of shares traded: The number of shares traded is taken to be 1bp of ADV for each stock per day for daily trading. For intraday-daily trading, the number of shares traded is assumed to be 70bps of ADV for the entire portfolio per day, which is then split evenly among all active trading periods during the day to get 70bps/85 of ADV per stock, per trading period (each day has 88 5-minute time bars, of which 85 are active trading periods).

Spread: Spread is assumed to be 1bp per day (0.01% per day) for daily trading. For intraday-daily trading, we assume 20bps per day, which we then split evenly over the day to incur a cost of 0.002/85 per time bar.

Signal changes: Number of times the trading signal changes from a buy (sell) to a sell (buy), for all stocks in the portfolio, over consecutive trading periods.
The use of the square-root rule in practice dates back many years, and it is often used as a pre-trade transaction cost estimate (Gatheral (2010)). The first term of the square-root formula in section 2.5 can be regarded as the term representing the slippage
Each trade (say an entry) is considered a child order of the trading strategy across the period. This is not a faithful representation, but it was chosen in order to optimise the historic simulation. It should be realised that each child order is not of a fixed size, as this is determined by the algorithm. However, at the end of the day we have shares traded, and entry and exit pairs.
In addition to the indirect costs associated with slippage and price impact, as accounted for by the square-root formula, we include direct costs such as the borrowing of trading capital, the cost of regulatory capital and the various fees associated with trading on the JSE (Loonat and Gebbie (2018)). Such costs also account for small fees incurred in instances where short-selling has taken place. For the daily data implementation, we assume a total direct cost of 4bps per day. This assumption is made purely to approximately match the total daily transaction cost assumption made by Loonat and Gebbie (2018). For the intraday-daily implementation, a total direct cost of 70bps per day is assumed (following Loonat and Gebbie (2018)), which we then split evenly over each day's active trading periods (85 time bars, since the first expert only starts trading after the initial time bars of the day) to get a cost of 70bps/85 per period. These costs are indicative; actual price impact requires either real trading experiments or full-scale market simulation, both of which are intractable in the context of our approach.
For daily trading, we recover an average daily transaction cost of roughly 17.75bps, which is almost double the 10bps assumed by Loonat and Gebbie (2018). Loonat and Gebbie argue that for intraday trading it is difficult to avoid direct and indirect costs of about 50-80bps per day in each case, leaving a conservative estimate of total costs of approximately 160bps per day. We realise an overall average cost per period of 2.17bps, so the average cost per day, assuming we trade for 85 periods throughout each day, is roughly 184bps (85 × 2.17) for intraday-daily trading.
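The per-period arithmetic behind the intraday-daily cost assumptions can be sketched as below. This is only an illustration under stated assumptions: the helper names, the hypothetical ADV and volatility inputs, and the use of volatility times the square root of the participation rate as the impact term are choices made for the example, and the sketch does not attempt to reproduce how the individual terms are combined in the study's cost formula.

```python
import math

BPS = 1e-4  # one basis point expressed as a decimal fraction


def per_bar(daily_amount, active_bars=85):
    """Split a daily cost or volume budget evenly over the active time bars."""
    return daily_amount / active_bars


def sqrt_impact(sigma, shares_traded, adv):
    """Square-root impact term: volatility times sqrt of the participation rate."""
    return sigma * math.sqrt(shares_traded / adv)


# Intraday-daily assumptions quoted in the text, expressed per time bar.
direct_per_bar = per_bar(70 * BPS)         # 70bps/85 direct cost
spread_per_bar = per_bar(20 * BPS)         # 0.002/85 spread
participation_per_bar = per_bar(70 * BPS)  # 70bps/85 of ADV traded per bar

# Hypothetical inputs, for illustration only (not taken from the study).
adv = 1_000_000.0  # average daily volume in shares
sigma = 0.001      # per-bar volatility estimate
impact = sqrt_impact(sigma, participation_per_bar * adv, adv)

# Daily total implied by the reported average cost of 2.17bps per period.
daily_total_bps = 85 * 2.17
```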
2.5.1 Volatility Estimation for Transaction Costs
In this section, we discuss the methods used to estimate the volatility for daily and intraday data in the square-root formula (section 2.5).
Daily Data Estimation. The volatility of daily prices on each day is taken to be the standard deviation of closing prices over the last 90 days. If 90 days have not yet passed, the standard deviation is taken over the number of days available so far.
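A minimal sketch of the daily estimate is given below. The function name is hypothetical, and two details are assumptions rather than statements from the text: the window is taken to include the current day's close, and the sample (n − 1) standard deviation is used.

```python
import statistics


def rolling_volatility(closes, window=90):
    """Standard deviation of closing prices over the last `window` days.

    While fewer than `window` days are available, all days seen so far are
    used. A single observation has no dispersion, so 0.0 is returned for it.
    """
    vols = []
    for t in range(len(closes)):
        start = max(0, t - window + 1)
        sample = closes[start:t + 1]
        vols.append(statistics.stdev(sample) if len(sample) > 1 else 0.0)
    return vols


# Toy example with a 3-day window instead of 90 days.
vols = rolling_volatility([1.0, 2.0, 3.0, 4.0], window=3)
```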
Intraday Data Estimation. The volatility for each intraday time bar on a given day depends on the time of day. For the first 15 time bars, the volatility is taken to be a forecast from a GARCH(1,1) model fitted on the last 60 returns of the previous day. The reason for this choice is that the market is very volatile during the opening hour, and there are relatively few data points available at that stage of the day with which to compute the volatility. The remainder of the day's volatility estimates are computed using the Realised Volatility (RV) method (Andersen et al. (2001)). RV is one of the more popular methods for estimating the volatility of high-frequency returns.
Assume that the instantaneous returns of observed log stock prices ($p_t$) with unobserved latent volatility ($\sigma_t$), scaled continuously through time by a standard Wiener process ($W_t$), can be generated by the continuous-time martingale (Poon (2008))
$$dp_t = \sigma_t \, dW_t. \qquad (11)$$
It follows that the conditional variance of the single-period returns, $r_{t+1} = p_{t+1} - p_t$, is
$$\sigma^2_{t+1} = \int_0^1 \sigma^2_{t+\tau} \, d\tau. \qquad (12)$$
This is also known as the integrated volatility for the period $t$ to $t+1$. Suppose the sampling frequency of the tick data into regularly spaced time intervals is denoted by $m$, so that between period $t$ and $t+1$ there are $m$ continuously compounded returns, then $r_{t+j/m} = p_{t+j/m} - p_{t+(j-1)/m}$ for $j = 1, \dots, m$. Hence, we can estimate the Realised Volatility (RV) based on intraday returns between periods $t$ and $t+1$ as
$$RV_{t+1} = \sum_{j=1}^{m} r^2_{t+j/m}. \qquad (13)$$
The argument here is that, provided we sample at frequent enough time steps ($m \to \infty$), the volatility can be observed theoretically from the sample path of the return process and hence (Karatzas and Shreve (1991); Poon (2008))
$$\operatorname*{plim}_{m \to \infty} \left( \int_0^1 \sigma^2_{t+\tau} \, d\tau - \sum_{j=1}^{m} r^2_{t+j/m} \right) = 0, \qquad (14)$$
which says that the RV of a sequence of returns asymptotically approaches the integrated volatility and hence the RV is a reasonable estimate of current volatility levels.
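As an illustration of the RV estimator described above, the sketch below sums squared log returns over the intraday prices observed within one period. The function names and the toy prices are assumptions for the example; taking the square root to express the estimate in return units is likewise an assumption about how the estimate would be used.

```python
import math


def log_returns(prices):
    """Continuously compounded returns between consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def realised_volatility(prices):
    """RV over one period: the sum of squared intraday log returns."""
    return sum(r * r for r in log_returns(prices))


# Hypothetical 5-minute prices within a single period.
prices = [100.0, 101.0, 100.0]
rv = realised_volatility(prices)

# Square root of RV, if a volatility in return units is required (assumption).
rv_in_return_units = math.sqrt(rv)
```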
3 Testing for Statistical Arbitrage
To test the overall trading strategy for statistical arbitrage, we implement a novel statistical test originally proposed by Hogan et al. (2004) and later modified by Jarrow et al. (2012), applying it to the overall strategy's profits and losses (PL). The idea is to define axiomatically the conditions under which a statistical arbitrage exists and to assume a parametric model for incremental trading profits. This allows a null hypothesis to be formed from the union of several sub-hypotheses, which are formulated to facilitate empirical tests of statistical arbitrage. The modified test proposed by Jarrow et al. (2012), called the Min test, is derived from a set of restrictions imposed on the parameters defined by the statistical arbitrage null hypothesis and is applied to a given trading strategy to test for statistical arbitrage. The Min statistic is argued to provide a much more efficient and powerful statistical test than the Bonferroni inequality used in Hogan et al. (2004). The statistical power of the Bonferroni approach decreases as the number of sub-hypotheses increases; as a result, it is unable to reject an incorrect null hypothesis, leading to a large Type II error.
To set the scene and introduce the concept of a statistical arbitrage, suppose that in some economy, a stock (portfolio)
Definition 1 (Statistical Arbitrage (Hogan et al. (2004); Jarrow et al. (2012))).
A statistical arbitrage is a zerocost, selffinancing trading strategy () with cumulative discounted trading profits such that:

,



In other words, a statistical arbitrage is a trading strategy that 1) has zero initial cost, 2) in the limit has positive expected discounted cumulative profits, 3) in the limit has a probability of loss that converges to zero and 4) has a variance of negative incremental trading profits (losses) that converges to zero in the limit. It is clear that deterministic arbitrage stemming from traditional financial mathematics is in fact a special case of statistical arbitrage (De Wit (2013)).
In order to test for statistical arbitrage, assume that the incremental discounted trading profits evolve over time according to the process
(15) 
where . There are two cases to consider for the innovations: 1) i.i.d N(0,1) normal uncorrelated random variables satisfying or 2) follows an MA(1) process given by:
(16) 
in which case the innovations are nonnormal and correlated. Here, is an i.i.d. N(0,1) normal uncorrelated random variable. It is also assumed that and, in the case of our algorithm, = 0. We will refer to the first model (normal uncorrelated innovations) as the unconstrained mean (UM) model and the second model (nonnormal and correlated innovations) as the unconstrained mean with correlation (UMC) model. Furthermore, we refer to the corresponding models with as the constrained mean (CM) and constrained mean with correlation (CMC) respectively, which assume constant incremental profits over time, and hence have an incremental profit process given by:
(17) 
The discounted cumulative trading profits for the UM model at terminal time , discounted back to the initial time, which are generated by a trading strategy are given by
(18) 
From equation 18, it is straightforward to show that the loglikelihood function for the discounted incremental trading profits is given by:
(19) 
The probability of a trading strategy generating a loss after periods is as follows (Jarrow et al. (2012))
(20) 
where denotes the cumulative standard normal distribution function. For the CM model, equation 20 is easily adjusted by setting and equal to zero. This probability converges to zero at a rate that is faster than exponential.
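To make the probability of loss concrete for the CM case, the sketch below assumes that the CM increments take the form $\mu + \sigma i^{\lambda} z_i$ with i.i.d. standard normal $z_i$, following the parametrisation of Jarrow et al. (2012). Under that assumption the cumulative profit after $n$ periods is normal with mean $n\mu$ and variance $\sigma^2 \sum_{i=1}^{n} i^{2\lambda}$. The function names and parameter values are hypothetical.

```python
import math
from statistics import NormalDist


def cm_loss_probability(mu, sigma, lam, n):
    """P(cumulative profit after n periods < 0) under the assumed CM model."""
    # ASSUMPTION: increment i equals mu + sigma * i**lam * z_i, z_i ~ N(0, 1).
    mean = n * mu
    std = sigma * math.sqrt(sum(i ** (2.0 * lam) for i in range(1, n + 1)))
    return NormalDist().cdf(-mean / std)


def periods_until_below(mu, sigma, lam, threshold=0.05, max_periods=10_000):
    """First n at which the loss probability drops below `threshold`, else None."""
    for n in range(1, max_periods + 1):
        if cm_loss_probability(mu, sigma, lam, n) < threshold:
            return n
    return None


# Hypothetical parameters: unit drift, unit volatility, no volatility decay.
n_star = periods_until_below(mu=1.0, sigma=1.0, lam=0.0)
```

With a positive mean and a non-positive volatility growth parameter, the argument of the normal distribution function grows at least like the square root of $n$, which is why the loss probability falls away so quickly.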
As mentioned previously, to facilitate empirical tests of statistical arbitrage under Definition 1, a set of subhypotheses are formulated to impose a set of restrictions on the parameters of the underlying process driving discounted cumulative incremental trading profits and are as follows:
Proposition 3.1 (UM Model Hypothesis (Jarrow et al. (2012))).
Under the four axioms defined in Definition 1, a trading strategy generates a statistical arbitrage under the UM model if the discounted incremental trading profits satisfy the intersection of the following four subhypotheses jointly: i.) , ii.) or , iii.), and .
An intersection of the above subhypotheses defines a statistical arbitrage, and as by De Morgan’s Laws
Proposition 3.2 (UM Model Alternative Hypothesis (Hogan et al. (2004); Jarrow et al. (2012))).
Under the four axioms defined in Definition 1, a trading strategy does not generate a statistical arbitrage if the discounted incremental trading profits satisfy any one of the following four subhypotheses: i.) , ii.) or , iii.) , and iv.)
The null hypothesis is not rejected provided that a single subhypothesis holds. The Min test is then used to test the above null hypothesis of no statistical arbitrage by considering each subhypothesis separately using the tstatistics and , where the hats denote the Maximum Likelihood Estimates (MLE) of the parameters. The Min statistic is defined as (Jarrow et al. (2012))
(21) 
The intuition is that the Min statistic returns the smallest test statistic, which corresponds to the sub-hypothesis that is closest to being accepted. The no statistical arbitrage null is then rejected if the Min statistic exceeds a critical value, which depends on the significance level of the test. Since the probability of rejecting the null cannot exceed the significance level, we have the following condition for the probability of rejecting the null at the chosen significance level
(22) 
What remains is to compute the critical value. We implement a Monte Carlo simulation procedure to compute it, which we describe in more detail in section 3.1, step 5 below.
3.1 Outline of the Statistical Arbitrage Test Procedure
The steps involved in testing for statistical arbitrage are outlined below:

1. Trading increments: From the vector of cumulative trading profits and losses, compute the increments.

2. Perform MLE: Compute the log-likelihood function, as given in equation 19, and maximise it to find the estimates of the four parameters. The log-likelihood function is adjusted depending on whether the CM or UM test is implemented. We only consider the CM test in this study. Since MATLAB's built-in constrained optimization algorithm only performs minimization, we minimise the negative of the log-likelihood function, which is equivalent to maximising the log-likelihood.

3. Standard errors: From the parameters estimated in the MLE step above, compute the negative Hessian evaluated at the MLE estimates, which is the Fisher Information (FI) matrix. In order to compute the Hessian, the analytical partial derivatives are derived from equation 19. Standard errors are then taken to be the square roots of the diagonal elements of the inverse of the FI matrix, since the inverse of the Fisher information matrix is an asymptotic estimator of the covariance matrix.

4. Min statistic: Compute the t-statistics for each of the sub-hypotheses and hence the resulting Min statistic given by equation 21. Not all of these t-statistics need to be considered for the CM test.

5. Critical values: Compute the critical value at the chosen significance level using the Monte Carlo procedure (uncorrelated normal errors) and bootstrapping (correlated non-normal errors).

   a. CM model: First, simulate 5000 different profit processes using equation 17. For each of the 5000 profit processes, perform MLE to get the estimated parameters, the associated t-statistics and finally the Min statistic. The critical value is then taken to be the (1 − α) quantile of the resulting distribution of Min values, where α is the significance level.

6. P-values: Compute the empirical probability of rejecting the null hypothesis at the chosen significance level using equation 22, utilising the critical value from the previous step and the simulated Min statistics.

7. n-Period Probability of Loss: Compute the probability of loss after n periods for each n, and observe the number of trading periods it takes for the probability of loss to converge to zero (or fall below 5%, as in the literature). This is done by computing the MLE estimates from the increments observed up to period n, for each given n, and substituting these estimates into equation 20.
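The bookkeeping parts of the procedure above (the increments, the critical value and the empirical p-value) can be sketched as follows. The MLE and t-statistic steps are deliberately not shown, so the simulated Min statistics are taken as given. The function names are hypothetical, the nearest-rank quantile rule is an assumption, and the p-value is interpreted here as the fraction of simulated Min statistics at least as large as the observed one.

```python
import math


def increments(cumulative_pl):
    """Incremental profits from a vector of cumulative profits and losses."""
    return [b - a for a, b in zip(cumulative_pl, cumulative_pl[1:])]


def empirical_quantile(values, q):
    """Nearest-rank empirical quantile (one of several common conventions)."""
    ordered = sorted(values)
    rank = math.ceil(q * len(ordered))
    index = min(len(ordered) - 1, max(0, rank - 1))
    return ordered[index]


def critical_value(simulated_min_stats, alpha=0.05):
    """(1 - alpha) quantile of the simulated distribution of Min statistics."""
    return empirical_quantile(simulated_min_stats, 1.0 - alpha)


def empirical_p_value(simulated_min_stats, observed_min_stat):
    """Fraction of simulated Min statistics at least as large as the observed one."""
    exceed = sum(1 for s in simulated_min_stats if s >= observed_min_stat)
    return exceed / len(simulated_min_stats)


# Toy stand-in for the simulated Min statistics.
simulated = [float(k) for k in range(1, 101)]
t_crit = critical_value(simulated, alpha=0.05)
p_val = empirical_p_value(simulated, observed_min_stat=101.0)
```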
3.2 Estimates of the Probability of Backtest Overfitting (PBO)
Bailey et al. (2014) heavily criticise recent studies which claim to have designed profitable investment or trading strategies, since many of these studies are based only on in-sample (IS) statistics, without evaluating out-of-sample (OOS) performance. We briefly address this concern by computing an estimate of the probability of backtest overfitting (PBO) using the combinatorially symmetric cross-validation (CSCV) procedure outlined in Bailey et al. (2016). Typically, an investor/researcher will run many trial backtests to select the parameter combinations which optimise the performance of the algorithm (usually based on some performance evaluation criterion such as the Sharpe Ratio). The idea is to perform CSCV on the matrix of performance series over time of length
Here, we must be clear that when we refer to IS, we do not mean the “training set” per se, during which the moving average lookback parameters were calculated, for example. Rather, we refer to IS as the subset of observations utilised in selecting the optimal strategy from the backtest trials.
In the case of the algorithm proposed in this study, since the large set of trialled parameters forms the basis of the learning algorithm in the form of the experts, we cannot observe the effect of different parameter settings on the overall strategy, as these are already built into the underlying algorithm. Rather, we run trial backtest simulations on independent subsets of historical data to get an idea of how the algorithm performs on different subsets of unseen data. We can then implement the CSCV procedure on the matrix of profits and losses resulting from the trials to recover a PBO estimate. Essentially, there is no training of parameters taking place in our model, as all parameter combinations are considered, and the weights of the performance-weighted average of the experts' strategies associated with the different parameter combinations are “learnt”.
More specifically, we choose a backtest length for each subset and split the entire history of OHLCV data into subsets of this length. The learning algorithm is then implemented on each subset to produce profit and loss time series. Note that the subsets will be completely independent from one another as there is no overlapping of the data that each separate simulation is run on. The results from the simulations are presented in table 1 below.
Data  Number of subsets  Backtest length  PBO 
Daily  30  60 days  1.4% 
Intraday-daily  22  3 days  11.4% 
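The CSCV procedure used to obtain PBO estimates of this kind can be sketched as follows. This is a simplified illustration rather than the code used in the study: the function names, the use of the Sharpe ratio as the performance measure, the number of row blocks and the toy matrix are all assumptions. Each column of the input is the profit and loss series of one trial, and the PBO is estimated as the fraction of splits in which the best in-sample trial ranks at or below the median out of sample.

```python
import math
from itertools import combinations
from statistics import mean, stdev


def sharpe(returns):
    """Sharpe ratio of a return series (zero when the series is constant)."""
    s = stdev(returns)
    return mean(returns) / s if s > 0 else 0.0


def pbo_cscv(perf, n_blocks=4):
    """Estimate the PBO from a T x N matrix (rows: time, columns: trials)."""
    n_trials = len(perf[0])
    block_len = len(perf) // n_blocks
    blocks = [perf[k * block_len:(k + 1) * block_len] for k in range(n_blocks)]
    logits = []
    # Every choice of half the blocks serves once as the in-sample set.
    for in_idx in combinations(range(n_blocks), n_blocks // 2):
        is_rows = [row for k in in_idx for row in blocks[k]]
        oos_rows = [row for k in range(n_blocks) if k not in in_idx
                    for row in blocks[k]]
        is_perf = [sharpe([row[j] for row in is_rows]) for j in range(n_trials)]
        oos_perf = [sharpe([row[j] for row in oos_rows]) for j in range(n_trials)]
        best = max(range(n_trials), key=lambda j: is_perf[j])
        # Relative out-of-sample rank of the in-sample winner, in (0, 1).
        rank = sum(1 for v in oos_perf if v <= oos_perf[best])
        omega = rank / (n_trials + 1)
        logits.append(math.log(omega / (1.0 - omega)))
    return sum(1 for x in logits if x <= 0.0) / len(logits)


# Toy matrix in which the first trial dominates in every block.
perf = [[0.02, 0.01, -0.02], [0.03, -0.01, -0.03]] * 4
pbo = pbo_cscv(perf, n_blocks=4)
```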
4 The Data
4.1 Daily Data
The daily data is sourced from Thomson Reuters and contains data corresponding to all stocks listed on the JSE Top 40
4.2 IntradayDaily Data
Bloomberg is the source of all tick (intraday) data used in this paper. The data set consists of 30 of the Top 40 stocks on the JSE from 02-01-2018 to 29-06-2018. The data is then sampled at 5-minute intervals to create an OHLCV entry for every 5-minute interval over the 6-month period. We remove the first 10 minutes and last 20 minutes of the continuous trading session (9:00-16:50), as the market is relatively illiquid and volatile during these times, which may lead to spurious trade decisions. We are thus left with 88 OHLCV entries for each stock on any given day. In addition to the intraday data, daily OHLCV data for the specified period is required for the last transaction on any given day. As in the daily data case, we make use of the STeFI index as the risk-free asset, and hence the daily entries for the STeFI index are included in this data set. The data was sourced from a Bloomberg terminal using the R Bloomberg API, Rblpapi, and all data processing is done in MATLAB to get the data into the required form for the learning algorithm.
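The bar count quoted above can be checked with a short sketch. The function names and the tick representation (price, volume pairs) are assumptions for illustration; the study's own processing was done in MATLAB and is not reproduced here.

```python
from datetime import datetime, timedelta


def bar_end_times(session_open="09:00", session_close="16:50",
                  skip_open_min=10, skip_close_min=20, bar_min=5):
    """End times of the 5-minute bars kept after trimming the session."""
    start = datetime.strptime(session_open, "%H:%M") + timedelta(minutes=skip_open_min)
    end = datetime.strptime(session_close, "%H:%M") - timedelta(minutes=skip_close_min)
    times = []
    t = start + timedelta(minutes=bar_min)
    while t <= end:
        times.append(t.strftime("%H:%M"))
        t += timedelta(minutes=bar_min)
    return times


def to_ohlcv(ticks):
    """Aggregate the (price, volume) ticks of one bar into an OHLCV tuple."""
    prices = [price for price, _ in ticks]
    volume = sum(vol for _, vol in ticks)
    return prices[0], max(prices), min(prices), prices[-1], volume


bars = bar_end_times()
# Hypothetical ticks falling inside a single 5-minute bar.
example_bar = to_ohlcv([(10.0, 100), (12.0, 50), (9.0, 25), (11.0, 10)])
```

Trimming 10 minutes from the open and 20 minutes from the close leaves the window 9:10 to 16:30, that is 440 minutes or 88 five-minute bars, with bar end times running from 9:15 to 16:30.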
5 Results and Analysis
5.1 Daily Data
In this section, we implement the various algorithms described above in order to plot a series of graphs for the daily JSE Top 40 data discussed in section 4 above. We plot four different graphs: first, the overall portfolio wealth over time; second, the cumulative profits and losses over time; third, the relative population wealth of experts, which corresponds to the wealth accumulated over time by each of the experts competing for wealth in the algorithm; and finally, the relative population wealth of the strategies, which takes the mean over all experts for each given trading strategy to get an accumulated wealth path for each technical trading rule.
For the purpose of testing the learning algorithm, we will identify the 15 most liquid stocks over one year prior to the start of active trading. The stocks, ranked by liquidity, are as follows: FSRJ.J, OMLJ.J, CFRJ.J, MTNJ.J, SLMJ.J, NTCJ.J, BILJ.J, SBKJ.J, WHLJ.J, AGLJ.J, SOLJ.J, GRTJ.J, INPJ.J, MNDJ.J and RMHJ.J.
No Transaction Costs
Barring transaction costs, it is clear that the portfolio makes favourable cumulative returns on equity over the six-year period, as is evident in figure 1. The performance of the online learning algorithm (blue) is similar to that of the benchmark BCRP strategy (orange), which is promising, as the original literature proves that the algorithm should track such a benchmark in the long run. The inset of figure 1 illustrates that the overall strategy provides consistent positive trading profits over the entire trading horizon. Figure 2 shows the expert wealth for all experts and the mean expert wealth for each strategy. These figures show that, on average, the underlying experts perform fairly poorly compared to the overall strategy; however, there is evidence that some experts make satisfactory returns over the period.
Table 2 and table 3 provide the group summary statistics of the terminal wealths of experts and of the experts' profits and losses over the entire trading horizon respectively, where experts are grouped based on their underlying strategy. The online ZAnticor
Strategy  Mean (mean rank)  St. Dev.  Min  Max 
EMA Xover  0.8739 (673.6343)  0.1767  0.5216  1.4493 
Ichimoku Kijun Sen  0.9508 (623.3194)  0.2313  0.5424  1.5427 
MACD  0.9504 (657.7639)  0.1750  0.5601  1.6065 
Moving Ave Xover  0.8895 (632.6944)  0.1930  0.5206  1.4505 
ACC  1.0994 (736.5833)  0.3131  0.5283  1.9921 
BOLL  1.0499 (569.1944)  0.3536  0.6076  1.7746 
Fast Stochastic  0.9995 (778.6111)  0.3699  0.6006  1.8555 
MARSI  1.0723 (639.3611)  0.2081  0.6947  1.6917 
MOM  1.0403 (681.4444)  0.1353  0.7349  1.3595 
Online AntiZBCRP  0.7579 (731.9444)  0.1935  0.4649  1.0924 
Online ZAnticor  1.3155 (694.5278)  0.4388  0.6363  2.3886 
Online ZBCRP  1.2818 (652.8611)  0.2637  0.8561  1.8341 
PROC  0.8963 (718.0833)  0.1631  0.6305  1.2161 
RSI  1.1339 (757.3889)  0.2544  0.6440  1.7059 
SAR  0.7314 (654.1111)  0.0619  0.6683  0.8683 
Slow Stochastic  1.1135 (793.2222)  0.3302  0.6955  2.1023 
Williams %R  0.9416 (728.6944)  0.3150  0.4662  1.5131 
Strategy  Mean  St. Dev.  Min  Max 
EMA Xover  0.00010  0.00633  0.09745  0.08074 
Ichimoku Kijun Sen  0.00004  0.00723  0.10467  0.06157 
MACD  0.00003  0.00725  0.15993  0.08074 
Moving Ave Xover  0.00009  0.00644  0.15993  0.11482 
ACC  0.00007  0.00760  0.15993  0.08028 
BOLL  0.00002  0.00711  0.06457  0.06480 
Fast Stochastic  0.00001  0.00847  0.06469  0.06279 
MARSI  0.00006  0.00612  0.06788  0.06527 
MOM  0.00004  0.00603  0.06051  0.15820 
Online AntiZBCRP  0.00022  0.00773  0.09847  0.09336 
Online ZAnticor  0.00021  0.00759  0.06475  0.09773 
Online ZBCRP  0.00021  0.00771  0.09336  0.09847 
PROC  0.00007  0.00733  0.10467  0.09745 
RSI  0.00010  0.00666  0.06460  0.09745 
SAR  0.00023  0.00724  0.10467  0.08724 
Slow Stochastic  0.00009  0.00809  0.06480  0.06820 
Williams %R  0.00006  0.00815  0.06820  0.06317 
Figure 3 illustrates the 2-D plot of the latent space of a Variational Autoencoder (VAE) for the wealth time series of all the experts, with experts coloured by object cluster. It is not surprising that the experts' wealth time series show quite well-defined clusters in terms of the stocks which the experts trade in their portfolios, as the stocks that each expert trades are directly related to the decisions they make given the incoming data, and hence to the corresponding returns (wealth) they achieve.
To provide some sort of comparison, in figure 3 we plot the same results as above, but this time we colour the experts in terms of their underlying strategy. The VAE seems to pick up much clearer similarities (dissimilarities) between the experts based on the stocks they trade than based on the strategy they utilise, providing evidence that the achieved wealth has a much stronger dependence on the stock choice than on the chosen strategy. This may be an important point to consider, and it gives an indication that it may be worth considering more sophisticated ways to choose the stocks to trade rather than developing more sophisticated/profitable strategies. A discussion of the features that should be considered by a quantitative investment manager in assessing an asset's usefulness is provided in Samo and Hendricks (2018).
Next, we implement the CM test for statistical arbitrage on the daily cumulative profits and losses (PL) for the strategy without transaction costs. In order to have a result that is comparable with Jarrow et al. (2012), we choose a period of 400 days to test our strategy. The 400-day test period is chosen to begin after the start of trading, which allows the algorithm to initialise and leaves enough time for the majority of the experts to have sufficient data to begin making trading decisions. Having simulated the 5000 different Min statistics as in section 3.1, step 5a, using simulations of the profit process in equation 17, figure 4 illustrates the histogram of Min values. The critical value is then computed as the 0.95-quantile of the simulated distribution, which corresponds to a significance level of 5%, and is illustrated by the red vertical line. The Min statistic resulting from the realised incremental profits and losses of the overall strategy is 3.0183 (vertical green line). By equation 22, we recover a p-value of zero. Thus, we can conclude that there is significant evidence to reject the null of no statistical arbitrage at the 5% significance level.
In addition to testing for statistical arbitrage, we also report the number of days it takes for the probability of loss of the strategy to decline below 5%, using equation 20 adjusted for the case of the CM model. As discussed in section 3.1, step 7, for each period n we perform MLE on the increments observed up to that period to get the parameter estimates. We then substitute these estimates into equation 20 to get an estimate of the probability of loss for that period. This is all done in terms of the CM model. The inset of figure 4 illustrates the probability of loss for each of the first 25 trading days, where we compute the probability of loss of the profit and loss process from the first trading period up until period n for each n. As is evident from the inset, it takes roughly 10 periods for the probability of loss to fall below 5%.
Transaction Costs
In this section we reproduce the results from above, but this time including transaction costs for daily trading as discussed in section 2.5. Once direct and indirect (section 2.5) costs have been computed, the transaction cost is subtracted from the profit and loss of each day, and the resulting value is compounded onto the previous period's wealth to get the wealth for the current period. These daily profits and losses are summed to get the cumulative profit and loss (PL).
It is clear from the inset of figure 5(a), which illustrates the profits and losses (PL) of the overall strategy less the transaction costs for each period, that consistent losses are incurred when transaction costs are incorporated. Furthermore, there is no evidence to reject the no statistical arbitrage null hypothesis, as the Min statistic resulting from the overall strategy is well below the critical value at the 95th percentile of the histogram, as illustrated in figure 5(b). In addition, although the probability of loss of the strategy with transaction costs included initially converges to zero, it eventually settles on a value of one. This is illustrated in the inset of figure 5(b).
Considering the evidence contained in figure 5(a), figure 5(b) and its inset, the overall strategy does not survive historical backtests in terms of profitability when transaction costs are considered, and may not be well suited to an investor utilising daily data who has limited time to make adequate profits. This is in agreement with Schulmeister (2009) in that there is a strong possibility that stock price and volume trends have shifted to higher frequencies than the daily time scale and, as a result, the profits of trading strategies have diminished over time on such time scales.
5.2 IntradayDaily Data
Below we report the results of the algorithm implementation for a combination of intraday and daily JSE data, as discussed in section 2.3. We run the algorithm on the OHLCV data of the 15 most liquid stocks from a set of 30 of the JSE Top 40. Liquidity is calculated in terms of average daily trade volume for the first 4 days of the period 02-01-2018 to 09-03-2018. The set of 15 stocks is as follows: FSR:SJ, GRT:SJ, SLM:SJ, BGA:SJ, SBK:SJ, WHL:SJ, CFR:SJ, MTN:SJ, DSY:SJ, IMP:SJ, APN:SJ, RMH:SJ, AGL:SJ, VOD:SJ and BIL:SJ. The remaining 40 days' data for the aforementioned period is used to run the learning algorithm. As in the daily data implementation, we again analyse the two cases of trading, with and without transaction costs, which we report in the following two subsections.
No Transaction Costs
Without transaction costs, the cumulative wealth achieved by the overall strategy, illustrated in figure 6(a), evolves similarly to an exponential function over time. The associated profits and losses are displayed in the inset of figure 6(a). Incremental profits and losses are much smaller than in the daily data case, resulting in a much smoother function (compare figure 1).
Table 4 is the intraday-daily analogue of table 2. In this case, the exponential moving average crossover strategy (EMA Xover) produces the expert with the greatest terminal wealth, and acceleration (ACC) the expert with the least terminal wealth. EMA Xover also produces the experts with the highest variation in terminal wealth. Price rate of change (PROC) comfortably provides the best mean-ranking experts among all strategies; however, ZBCRP produces the experts with the highest mean terminal wealth.
Again, as for the daily data case, we implement a test for statistical arbitrage for intraday-daily trading without transaction costs for 400 trading periods starting from the time bar of the 2nd trading day
Strategy  Mean (mean rank)  St. Dev.  Min  Max 
EMA Xover  1.0024 (662.7639)  0.0094  0.9801  1.0375 
Ichimoku Kijun Sen  0.9989 (710.3750)  0.0085  0.9663  1.0303 
MACD  0.9995 (684.8704)  0.0067  0.9720  1.0202 
Moving Ave Xover  1.0012 (708.7824)  0.0058  0.9766  1.0204 
ACC  0.9953 (831.3333)  0.0079  0.9646  1.0048 
BOLL  0.9974 (712.9722)  0.0069  0.9787  1.0089 
Fast Stochastic  0.9991 (711.4167)  0.0040  0.9871  1.0085 
MARSI  0.9973 (736.2500)  0.0062  0.9824  1.0094 
MOM  0.9982 (723.1389)  0.0087  0.9700  1.0082 
Online AntiZBCRP  0.9980 (597.3056)  0.0062  0.9828  1.0103 
Online ZAnticor  1.0015 (655.7778)  0.0058  0.9896  1.0180 
Online ZBCRP  1.0031 (566.5833)  0.0069  0.9898  1.0149 
PROC  0.9980 (445.1389)  0.0064  0.9814  1.0140 
RSI  0.9997 (535.5833)  0.0065  0.9861  1.0171 
SAR  0.9945 (499.7222)  0.0053  0.9790  1.0005 
Slow Stochastic  1.0007 (508.5278)  0.0048  0.9927  1.0173 
Williams %R  1.0020 (536)  0.0034  0.9957  1.0133 
The figure inset of figure 6(b) illustrates the probability of loss for each of the first 25 periods of the 400 periods as discussed in the above paragraph. It takes roughly an hour (13 periods) for the probability of loss to converge to zero.
Transaction Costs
We now report the results of the algorithm run on the same intradaydaily data as in the subsection above but this time with transaction costs incorporated (see section 2.5). figure 7(a) and the figure inset illustrate the overall cumulative portfolio wealth (S) and profits and losses (PL) respectively for intradaydaily trading with transaction costs. For comparative reasons, the axes are set to be equivalent to those illustrated in the case of no transaction costs (figure 7(a) and the figure inset). Surprisingly, even with a total daily trading cost (direct and indirect) of roughly 130bps, which is a fairly conservative approach, the algorithm is able to make satisfactory returns, which is in contrast to the daily trading case (figure 5(a)). Furthermore, figure 7(b) provides significant evidence to reject the no statistical arbitrage null hypothesis and returns a Min statistic almost identical (4.32 in the transaction costs case compared to 3.87) to that of the case of no transaction costs (figure 6(b)). Even more comforting, is the fact that even when transaction costs are considered, the probability of loss per trading period converges to zero, albeit slightly slower (roughly 2 hours or 31 trading periods) than the case of no transaction costs (roughly 1 hour or 13 trading periods, as illustrated in the inset of figure 6(b)).
The above results for intraday-daily trading are in complete contrast to the case of daily trading with transaction costs, where the no statistical arbitrage null could not be rejected, the probability of loss did not converge to zero and remain there, and trading profits steadily declined over the trading horizon. This suggests that the proposed algorithm may be much better suited to trading at higher frequencies. This is not surprising and agrees with Schulmeister (2009), who argues that the profitability of technical trading strategies on daily data declined from 1960 before the strategies became unprofitable from the 1990s. Schulmeister (2009) then implements a substantial set of technical trading strategies on 30-minute data, and the evidence suggests that such strategies returned adequate profits between 1983 and 2007, although the profits declined slightly between 2000 and 2007 compared with the 1980s and 1990s. This suggests that markets may have become more efficient, and raises the possibility that stock price and volume trends have shifted to even higher frequencies than 30 minutes (Schulmeister (2009)). This supports the choice to trade the algorithm proposed in this paper on at least 5-minute OHLCV data and reinforces our conclusion that, ultimately, the most desirable implementation of the algorithm would be in volume-time, which is best suited for high frequency trading.
6 Conclusion
We have developed a learning algorithm built from a base of technical trading strategies for the purpose of trading equities on the JSE that is able to provide favourable returns when ignoring transaction costs, under both daily and intraday trading conditions. The returns are reduced when transaction costs are considered in the daily setting; however, there is sufficient evidence to suggest that the proposed algorithm is well suited to intraday trading.
This is reinforced by the fact that there exists meaningful evidence to reject a carefully defined null hypothesis of no statistical arbitrage in the overall trading strategy even when a reasonably aggressive view is taken on intraday trading costs. We are also able to show that, in both the daily and intraday-daily data implementations, the probability of loss declines below 5% relatively quickly, which strongly suggests that the algorithm is well suited for a trader whose preference or requirement is to make adequate returns in the short run. It may well be that the statistical arbitrages we have identified intraday are artefacts from “price distorters” (Moffit (2017)) rather than legitimate mispricing in the sense of majority views relative to a trading minority, and hence cannot be easily traded out of profit. This suggests that it can be important to try to unpack the difference between structural mispricings and statistical arbitrages; doing so is outside the scope of the current work and cannot be determined using the tests implemented here.
The superior performance of the algorithm for intraday trading is in agreement with Schulmeister (2009), who concluded that while the daily profitability of a large set of technical trading strategies has steadily declined since 1960 and has been unprofitable since the onset of the 1990s, trading the same strategies on 30-minute (intraday) data between 1983 and 2007 has produced decent average gross returns. However, such returns have slowly declined since the early 2000s. In conclusion, the proposed algorithm is much better suited to trading at higher frequencies, but we are also aware that, over time, trading strategies that are not structural in nature are slowly arbitraged away through overcrowding.
We are also cognisant of the fact that intraday trading will require a large component of accumulated trading profits to finance frictions, concretely to fund direct, indirect and business model costs (Loonat and Gebbie (2018)). For this reason, we are careful to remain sceptical of the long-run performance of this class of algorithms when trading with real money in a live trading environment for profit. The current design of the algorithm is not yet ready to be traded on live market data. However, with some effort it can be transferred to such use cases, given the sequential nature of the algorithm and its inherent ability to receive and adapt to new incoming data while making appropriate trading decisions based on that data. Concretely, the algorithm should be deployed in the context of volume-time trading rather than the calendar-time context considered in this work.
Possible future work includes implementing the algorithm in volume-time, which will be best suited for dealing with a high frequency implementation of the proposed algorithm, given the intermittent nature of order-flow. We also propose replacing the learning algorithm with an online (adaptive) neural network that has the ability to predict optimal holding times of stocks. Another interesting line of work that has been considered is to model the population of trading experts as competing in a predator-prey environment (Farmer (2000); Johnson et al. (2013)). This was an initial key motivation for the research project: to find which collections of technical trading strategies can be grouped collectively and how these would interact with each other. This can include using cluster analysis to group or separate trading experts based on their similarities and dissimilarities, and hence make appropriate inferences regarding their interactions and behaviours at the level of collective and emergent dynamics. This can in turn be used for cluster-based approaches to portfolio control.
Acknowledgements
NM and TG would like to thank the Statistical Finance Research Group in the Department of Statistical Sciences for various useful discussions relating to the work. In particular, we would like to thank Etienne Pienaar, Lionel Yelibi and Duncan Saffy. We thank Michael Gant for his help with developing some of the strategies and for numerous valuable discussions with regard to the statistical arbitrage test.
Funding
TG would like to thank UCT FRC for funding (UCT fund 459282).
Supplemental material
Please access the supplemental material at Murphy (2019).
Appendix A Technical Indicators and Trading Rules
We follow Creamer and Freund (2010) and Kestner (2003) in introducing and describing some of the more popular technical analysis indicators as well as a few others that are widely available. We also provide some trading rules which use technical indicators to generate buy, sell and hold signals.
Indicator  Description  Calculation  
The Simple Moving Average (SMA) is the mean of the closing prices over the last n trading days. The smaller the value of n, the closer the moving average will fit to the price data. 


The Exponential Moving Average (EMA) uses today’s close price, yesterday’s moving average value and a smoothing factor. The smoothing factor determines how quickly the exponential moving average responds to current market prices (Kestner (2003)). 


HH(n)  The Highest High (HH) is the greatest high price in the last n periods and is determined from the vector of the high prices of the last n periods. 
Given the high prices of the last n periods:


LL(n)  The Lowest Low (LL) is the smallest low price in the last n periods and is found from the vector of low prices in the last n periods. 
Given the low prices of the last n periods:


The Ichimoku Kinko Hyo (IKH) (at a glance equilibrium chart) system consists of five lines and the Kumo (cloud). 
Tenkan-sen (Conversion Line):


Momentum (MOM) gives the change in the closing price over the past n periods. 


Acceleration (ACC) measures the change in momentum between two consecutive periods. 


The Moving Average Convergence/Divergence (MACD) oscillator attempts to determine whether traders are accumulating stocks or distributing stocks. It is calculated by computing the difference between a short-term and a long-term moving average. A signal line is computed by taking an EMA of the MACD and, when used in conjunction with the MACD, determines the instances to buy (oversold) and sell (overbought). 
Long-term EWM:


and  The Fast Stochastic Oscillator shows the location of the closing price relative to the high-low range, expressed as a percentage, over a given number of periods as specified by a look-back parameter. 


and  The Slow Stochastic Oscillator is very similar to the fast stochastic indicator and is in fact just a moving average of the fast stochastic indicator. 


Relative Strength Index (RSI) compares the periods that stock prices finish up (closing price higher than the previous period) against those periods that stock prices finish down (closing price lower than the previous period). 


The Moving Average Relative Strength Index (MARSI) is an indicator that smooths out the action of the RSI indicator. 


Bollinger (Boll) bands use an SMA as their reference point (known as the median band). The upper and lower Bollinger bands are calculated as functions of standard deviations around the median band. 
Median band:


The Price Rate of Change (PROC) is the rate of change of the time series of closing prices over the last n periods, expressed as a percentage. 


Williams Percent Range (Williams %R) is calculated similarly to the fast stochastic oscillator and shows the level of the close relative to the highest high in the last n periods. 


Parabolic Stop and Reverse (SAR), developed by J. Welles Wilder, is a trend indicator formed by a parabolic line made up of dots at each time step (Wilder (1978)). The dots are formed using the most recent Extreme Price (EP) and an acceleration factor (AF), 0.02, which increases each time a new EP is reached. The AF has a maximum value of 0.2 to prevent it from getting too large. The EP represents the highest (lowest) value reached by the price in the current uptrend (downtrend). Because the AF increases each time a new EP is observed, it determines where the parabolic line appears in relation to the price and thus affects the rate of change of the Parabolic SAR. 
Calculating the SAR indicator:


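Several of the indicators described in the table above can be sketched in a few lines of plain Python. The sketch below follows the common textbook definitions rather than the exact formulas of the original table, so every convention in it is an assumption: the function names are hypothetical, the EMA uses a smoothing factor of 2/(n+1) seeded with the first close, the RSI uses simple sums of gains and losses, Williams %R is reported on the usual -100 to 0 scale, and the Bollinger band width k = 2 is a default chosen for illustration.

```python
from statistics import mean, pstdev


def sma(close, n):
    """Simple moving average; None until n closes are available."""
    return [None if t + 1 < n else mean(close[t + 1 - n:t + 1])
            for t in range(len(close))]


def ema(close, n):
    """Exponential moving average (assumed alpha = 2/(n+1), seeded with close[0])."""
    alpha = 2.0 / (n + 1)
    out = [float(close[0])]
    for c in close[1:]:
        out.append(alpha * c + (1 - alpha) * out[-1])
    return out


def highest_high(high, n):
    """HH(n): greatest high over the last n periods."""
    return max(high[-n:])


def lowest_low(low, n):
    """LL(n): smallest low over the last n periods."""
    return min(low[-n:])


def momentum(close, n):
    """MOM: change in the closing price over the past n periods."""
    return close[-1] - close[-1 - n]


def proc(close, n):
    """Rate of change of the close over n periods, as a percentage."""
    return 100.0 * (close[-1] - close[-1 - n]) / close[-1 - n]


def fast_k(close, high, low, n):
    """Fast stochastic: close relative to the high-low range (assumes HH > LL)."""
    hh, ll = highest_high(high, n), lowest_low(low, n)
    return 100.0 * (close[-1] - ll) / (hh - ll)


def williams_r(close, high, low, n):
    """Williams %R on the assumed -100..0 scale (assumes HH > LL)."""
    hh, ll = highest_high(high, n), lowest_low(low, n)
    return -100.0 * (hh - close[-1]) / (hh - ll)


def rsi(close, n):
    """RSI from summed gains and losses over the last n price changes."""
    changes = [b - a for a, b in zip(close[-n - 1:-1], close[-n:])]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if losses == 0:
        return 100.0
    return 100.0 - 100.0 / (1.0 + gains / losses)


def bollinger(close, n, k=2.0):
    """Return (lower, median, upper) bands: SMA plus/minus k standard deviations."""
    window = close[-n:]
    mid, sd = mean(window), pstdev(window)
    return mid - k * sd, mid, mid + k * sd


if __name__ == "__main__":
    # Toy, made-up prices purely to exercise the functions.
    close = [10, 11, 12, 11, 13, 14]
    high = [c + 1 for c in close]
    low = [c - 1 for c in close]
    print(sma(close, 3), rsi(close, 5), bollinger(close, 3))
    print(fast_k(close, high, low, 3), williams_r(close, high, low, 3))
```

With these conventions the fast stochastic and Williams %R are mirror images of one another, which matches the table's remark that Williams %R is calculated similarly to the fast stochastic oscillator.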
Trading Rule  Decision  Condition 
Moving Average Crossover 