Uncovering the Internal Structure of the Indian Financial Market:
Cross-correlation behavior in the NSE
The cross-correlations between price fluctuations of 201 frequently traded stocks in the National Stock Exchange (NSE) of India are analyzed in this paper. We use daily closing prices for the period 1996-2006, which coincides with the period of rapid transformation of the market following liberalization. The eigenvalue distribution of the cross-correlation matrix, $\mathbf{C}$, of NSE is found to be similar to that of developed markets, such as the New York Stock Exchange (NYSE): the majority of eigenvalues fall within the bounds expected for a random matrix constructed from mutually uncorrelated time series. Of the few largest eigenvalues that deviate from the bulk, the largest is identified with market-wide movements. The intermediate eigenvalues that occur between the largest and the bulk have been associated in NYSE with specific business sectors with strong intra-group interactions. However, in the Indian market, these deviating eigenvalues are comparatively very few and lie much closer to the bulk. We propose that this is because of the relative lack of distinct sector identity in the market, with the movement of stocks dominantly influenced by the overall market trend. This is shown by explicit construction of the interaction network in the market, first by generating the minimum spanning tree from the unfiltered correlation matrix, and later, using an improved method of generating the graph after filtering out the market mode and random effects from the data. Both methods show, compared to developed markets, the relative absence of clusters of co-moving stocks that belong to the same business sector. This is consistent with the general belief that emerging markets tend to be more correlated than developed markets.
1 Introduction
“Because nothing is completely certain but subject to fluctuations, it is dangerous for people to allocate their capital to a single or a small number of securities. […] No one has reason to expect that all securities …will cease to pay off at the same time, and the entire capital be lost.” – from the 1776 prospectus of an early mutual fund in the Netherlands [1]
As is evident from the above quotation, the correlation between price movements of different stocks has long been a topic of vital interest to those involved with the study of financial markets. With the recent understanding of such markets as examples of complex systems with many interacting components, these cross-correlations have been used to infer the existence of collective modes in the underlying dynamics of stock prices. It is natural to expect that stocks which strongly interact with each other will have correlated price movements. Such interactions may arise because the companies belong to the same business sector (i.e., they compete for the same set of customers and face similar market conditions), or they may belong to related sectors (e.g., automobile and energy sector stocks would be affected similarly by a rise in gasoline prices), or they may be owned by the same business house and therefore perceived by investors to be linked. In addition, all stocks may respond similarly to news breaks that affect the entire market (e.g., the outbreak of a war) and this induces market-wide correlations. On the other hand, information that is related only to a particular company will tend to decorrelate its price movement from those of others.
Thus, the effects governing the cross-correlation behavior of stock price fluctuations can be classified into (i) market (i.e., common to all stocks), (ii) sector (i.e., related to a particular business sector) and (iii) idiosyncratic (i.e., limited to an individual stock). The empirically obtained correlation structure can then be analyzed to find out the relative importance of such effects in actual markets. Physicists investigating financial market structure have focussed on the spectral properties of the correlation matrix, with pioneering studies investigating the deviation of these properties from those of a random matrix, which would have been obtained had the price movements been uncorrelated. It was found that the bulk of the empirical eigenvalue distribution matches fairly well that expected from a random matrix, as does the distribution of eigenvalue spacings [2, 3]. Among the few large eigenvalues that deviate from the random matrix predictions, the largest represents the influence of the entire market common to all stocks, while the remaining eigenvalues correspond to different business sectors [4], as indicated by the composition of the corresponding eigenvectors [5]. However, although models in which the market is assumed to be composed of several correlated groups of stocks are found to reproduce many spectral features of the empirical correlation matrix [6], one needs to filter out the effects of the market-wide signal as well as noise in order to identify the group structure in an actual market. Recently, such filtered matrices have been used to reveal significant clustering among a large number of stocks from the NYSE [7].
The discovery of complex market structure in developed financial markets such as the NYSE and Japan [8] brings us to the question of whether emerging markets show similar behavior. While it is generally believed that stock prices in developing markets tend to be relatively more correlated than in developed ones [9], there have been very few studies of the former in terms of analysing the spectral properties of correlation matrices [10–14]. (Most studies of correlated price movements in emerging markets have looked at synchronicity, which measures the incidence of similar (i.e., up or down) price movements across stocks; this is not the same as correlation, which measures the relative magnitude of the change as well as its direction, although the two are obviously closely related.)
In this paper we present the first detailed study of cross-correlations in the Indian financial market over a significant period of time, one that coincides with the decade of rapid transformation of the recently liberalized economy into one of the fastest growing in the world. The prime motivation for our study of one of the largest emerging markets is to see if there are significant deviations from developed markets in terms of the properties of its collective modes. As already shown by us [14–16], the return distribution in Indian markets closely follows the “inverse cubic law” that has been reported in developed markets. If, therefore, deviations are observed in the correlation properties, these would be almost entirely due to differences in the nature of interactions between stocks. Indeed, we do observe that the Indian market shows a higher degree of correlation compared to, e.g., NYSE. We present the hypothesis that this is due to the dominance of the market-wide signal and the relative absence of significant group structure among the stocks. This may indicate that one of the hallmarks of the transition of a market from emerging to developed status is the appearance and consolidation of distinct business sector identities.
2 The Indian Financial Market
There are 23 different stock markets in India. The largest of these is the National Stock Exchange (NSE), which accounted for more than half of the entire combined turnover for all Indian financial markets in 2003-04 [17], although its market capitalization is comparable to that of the second largest market, the Bombay Stock Exchange. The NSE is considerably younger than most other Indian markets, having commenced operations in the capital (equities) market in Nov 1994. However, as of 2004, it is already the world’s third largest stock exchange (after NASDAQ and NYSE) in terms of transactions [17]. It is thus an excellent source of data for studying the correlation structure of price movements in an emerging market.
Description of the data set. We have considered the daily closing price time series of stocks traded in the NSE, available from the exchange web-site [18]. For cross-correlation analysis, we have focused on daily closing price data of $N = 201$ NSE stocks from Jan 1, 1996 to May 31, 2006, which corresponds to $T$ working days (the individual stocks, along with the business sector to which they belong, are given in Table 1). The selection of the stocks was guided by the need to minimise missing data in the time-series, a problem common to data from other emerging markets [10]. In our data, 45 stocks have no missing data, while among the remaining stocks, even the one with the largest fraction of missing data has prices missing for only a small part of the total period covered. (In case of a date with missing price data, it is assumed that no trading took place on that day, so that the price remained the same as on the preceding day.)
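The treatment of missing prices described above (the previous close is carried forward) can be sketched as follows. This is only an illustration under stated assumptions: the function name `forward_fill` and the use of `None` to mark a missing day are hypothetical choices, not taken from the data preparation actually used for the NSE series.

```python
def forward_fill(prices):
    """Replace missing entries (None) by the most recent available price.

    Entries before the first observed price stay None, since there is
    no preceding day to copy from.
    """
    filled = []
    last = None
    for p in prices:
        if p is None:
            # No trading assumed on this day: price unchanged from previous day.
            filled.append(last)
        else:
            filled.append(p)
            last = p
    return filled


# Hypothetical closing prices with three missing days.
closes = [100.0, None, 102.0, None, None, 101.0]
print(forward_fill(closes))  # [100.0, 100.0, 102.0, 102.0, 102.0, 101.0]
```

A carried-forward price yields a zero log return on the missing day, which is exactly the "no trading" assumption stated in the text.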
3 The Return Cross-Correlation Matrix
To measure correlation between the price movements across different stocks, we first need to measure the price fluctuations such that the result is independent of the scale of measurement. For this, we calculate the logarithmic return of price. If $P_i(t)$ is the stock price of the $i$th stock at time $t$, then the (logarithmic) price return is defined as
$$R_i(t, \Delta t) \equiv \ln P_i(t + \Delta t) - \ln P_i(t).$$
For daily return, $\Delta t$ = 1 day. By subtracting the average return and dividing the result by the standard deviation of the returns (which is a measure of the volatility of the stock), $\sigma_i = \sqrt{\langle R_i^2 \rangle - \langle R_i \rangle^2}$, we obtain the normalized price return,
$$r_i(t, \Delta t) \equiv \frac{R_i - \langle R_i \rangle}{\sigma_i},$$
where $\langle \cdots \rangle$ represents time average. Once the return time series for $N$ stocks over a period of $T$ days are obtained, the cross-correlation matrix $\mathbf{C}$ is calculated, whose element $C_{ij} = \langle r_i r_j \rangle$ represents the correlation between returns for stocks $i$ and $j$.
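As a minimal sketch of this construction, assuming the prices are held in a NumPy array with one column per stock, the normalized returns and the cross-correlation matrix can be computed as below. The function names and the synthetic random-walk prices are illustrative assumptions, not the NSE data.

```python
import numpy as np


def normalized_returns(prices):
    """prices: array of shape (T + 1, N), one column per stock.

    Returns the (T, N) array of normalized daily log returns r_i(t).
    """
    log_p = np.log(np.asarray(prices, dtype=float))
    R = log_p[1:] - log_p[:-1]                    # R_i(t, dt) with dt = 1 day
    return (R - R.mean(axis=0)) / R.std(axis=0)   # subtract mean, divide by sigma_i


def cross_correlation(prices):
    """Cross-correlation matrix with elements C_ij = <r_i r_j> (time average)."""
    r = normalized_returns(prices)
    T = r.shape[0]
    return r.T @ r / T


# Synthetic example: 5 independent geometric random walks over 500 days.
rng = np.random.default_rng(0)
prices = 100.0 * np.exp(np.cumsum(0.01 * rng.standard_normal((501, 5)), axis=0))
C = cross_correlation(prices)
print(np.round(C, 2))
```

Because each return series is scaled to zero mean and unit variance, the diagonal of `C` is 1 and the matrix coincides with the usual Pearson correlation of the log returns.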
If the time series are uncorrelated, then the resulting random correlation matrix, also known as a Wishart matrix, has eigenvalues distributed according to [19]:
$$P(\lambda) = \frac{Q}{2\pi} \frac{\sqrt{(\lambda_{\max} - \lambda)(\lambda - \lambda_{\min})}}{\lambda},$$
with $N \to \infty$, $T \to \infty$, such that $Q = T/N \geq 1$. The bounds of the distribution are given by $\lambda_{\max} = [1 + (1/\sqrt{Q})]^2$ and $\lambda_{\min} = [1 - (1/\sqrt{Q})]^2$. For the NSE data, the value of $Q = T/N$ fixes the upper bound $\lambda_{\max}$ below which all eigenvalues should lie in the absence of any correlations. As seen in Fig. 1 (left), the bulk of the empirical eigenvalue distribution indeed occurs below this value. However, a small fraction of the eigenvalues deviate from the random matrix behavior, and by analyzing them we should be able to obtain an understanding of the interaction structure of the market.
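The random-matrix bounds and density are simple to evaluate numerically. The sketch below uses hypothetical function names and an arbitrary illustrative choice $T = 400$, $N = 100$ (so $Q = 4$); these are not the NSE values.

```python
import math


def rmt_bounds(T, N):
    """Lower and upper eigenvalue bounds for an uncorrelated (Wishart) matrix."""
    Q = T / N
    lam_min = (1.0 - 1.0 / math.sqrt(Q)) ** 2
    lam_max = (1.0 + 1.0 / math.sqrt(Q)) ** 2
    return lam_min, lam_max


def rmt_density(lam, T, N):
    """Eigenvalue density P(lambda); zero outside [lam_min, lam_max]."""
    Q = T / N
    lam_min, lam_max = rmt_bounds(T, N)
    if lam <= lam_min or lam >= lam_max:
        return 0.0
    return Q / (2.0 * math.pi) * math.sqrt((lam_max - lam) * (lam - lam_min)) / lam


# Illustrative values only: Q = 400 / 100 = 4.
lam_min, lam_max = rmt_bounds(T=400, N=100)
print(lam_min, lam_max)  # 0.25 2.25
```

Empirical eigenvalues above `lam_max` are the candidates for genuine (non-random) correlations; for finite $N$ and $T$ the edge is slightly blurred, so eigenvalues very close to the bound should be interpreted with care.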
The random nature of the smaller eigenvalues is also indicated by an observation of the distribution of the corresponding eigenvector components. Note that these components are normalized for each eigenvalue $\lambda_j$ such that $\sum_{i=1}^{N} [u_{ji}]^2 = N$, where $u_{ji}$ is the $i$-th component of the $j$th eigenvector. For random matrices generated from uncorrelated time series, the distribution of the eigenvector components is given by the Porter-Thomas distribution,
$$P(u) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{u^2}{2}\right).$$
As shown in Fig. 1 (right), this distribution fits the empirical histogram of the eigenvector components for the eigenvalues belonging to the bulk. However, the eigenvectors of the largest eigenvalues (e.g., that of the largest eigenvalue, as shown in the inset) deviate quite significantly, indicating their non-random nature.
The largest eigenvalue for the NSE cross-correlation matrix is more than 28 times larger than the maximum predicted by random matrix theory (RMT). The corresponding eigenvector shows a relatively uniform composition, with all stocks contributing to it and all elements having the same sign (Fig. 2, top). As this is indicative of a common component that affects all the stocks with the same bias, the largest eigenvalue is associated with the market mode, i.e., the collective response of the entire market to information (e.g., newsbreaks) [2, 3].
Of more interest for understanding the market structure are the intermediate eigenvalues that occur between the largest eigenvalue and the bulk predicted by RMT. For the NYSE, it was shown that the eigenvectors corresponding to these eigenvalues are localized, i.e., only a small number of stocks contribute significantly to these modes [4, 5]. It was also observed that, for a particular eigenvector, the significantly contributing elements were stocks that belonged to similar or related businesses (with the exception of the second largest eigenvalue, where the contribution was from stocks having large market capitalization). Fig. 2 shows the stocks, arranged into groups according to their business sector, contributing to the different intermediate eigenvectors very unequally (the significant contributions to the second largest eigenvalue were found to be from the stocks SBIN, SATYAMCOMP, SURYAROSNI, ITC, BHEL, NAGARFERT, ACC, GLAXO, DRREDDY and RANBAXY). For example, it is apparent that Technology stocks contribute significantly to the eigenvector corresponding to the third largest eigenvalue. However, direct inspection of eigenvector composition for the deviating eigenvalues does not yield a straightforward interpretation of the significant group of stocks, possibly because the largest eigenmode corresponding to the market dominates over all intra-group correlations.
For more detailed analysis of the eigenvector composition, we use the inverse participation ratio (IPR), which is defined for the $k$-th eigenvector as $I_k \equiv \sum_{i=1}^{N} [u_{ki}]^4$, where $u_{ki}$ are the components of the $k$th eigenvector (here normalized to unit length). For an eigenvector with equal components, $u_{ki} = 1/\sqrt{N}$, which is approximately the case for the eigenvector corresponding to the largest eigenvalue, $I_k = 1/N$. If, on the other hand, a single component has a dominant contribution, e.g., $u_{k1} = 1$ and $u_{ki} = 0$ for $i \neq 1$, we have $I_k = 1$. Therefore, IPR is inversely related to the number of significantly contributing eigenvector components. For the eigenvectors corresponding to eigenvalues of a random correlation matrix, $\langle I \rangle = 3/N$. As seen from Fig. 3, the eigenvalues belonging to the bulk predicted by random matrix theory indeed have eigenvectors with this value of IPR. But, at the lower and higher end of eigenvalues, the market shows deviations from this value, suggesting the existence of localized eigenvectors (the deviations for the smallest eigenvalues indicate strong correlations between a few stocks; see Table 2). These deviations are, however, much less significant and far fewer in number in the Indian market compared to developed markets, implying that while correlated groups of stocks do exist in the latter, their existence is far less clear in the NSE.
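The three reference values of the IPR quoted above ($1/N$, $1$ and $3/N$) can be checked with a few lines of code. The function name is a hypothetical choice, and the random vector stands in for an eigenvector of a random correlation matrix under the assumption that its components are approximately Gaussian.

```python
import numpy as np


def inverse_participation_ratio(u):
    """IPR of a vector: sum of fourth powers after normalizing to unit length."""
    u = np.asarray(u, dtype=float)
    u = u / np.linalg.norm(u)          # enforce sum_i u_i^2 = 1
    return float(np.sum(u ** 4))


N = 200
uniform = np.ones(N)                    # all stocks contribute equally
localized = np.zeros(N)
localized[0] = 1.0                      # a single dominant component
random_vec = np.random.default_rng(0).standard_normal(N)

print(inverse_participation_ratio(uniform))     # equals 1/N up to rounding
print(inverse_participation_ratio(localized))   # equals 1
print(N * inverse_participation_ratio(random_vec))  # fluctuates around 3
```

The reciprocal of the IPR can be read as a rough count of the stocks that participate significantly in a given eigenvector.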
In order to graphically present the interaction structure of the stocks in NSE, we use a method suggested by Mantegna [20] to transform the correlation between stocks into distances to produce a connected network in which co-moving stocks are clustered together. The distance $d_{ij}$ between two stocks $i$ and $j$ is calculated from the cross-correlation matrix $\mathbf{C}$, according to $d_{ij} = \sqrt{2(1 - C_{ij})}$. These are used to construct a minimum spanning tree, which connects all the $N$ nodes of a network with $N - 1$ edges such that the total sum of the distances along the edges, $\sum d_{ij}$, is minimum. For the NYSE, such a construction has been shown to cluster together stocks belonging to the same business sector [21]. However, as seen in Fig. 4, for the NSE, such a method fails to clearly segregate any of the business sectors. Instead, stocks belonging to very different sectors are equally likely to be found within each cluster. This suggests that the market mode is dominating over all intra-sector interactions.
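A minimal sketch of the distance transformation and tree construction is given below. Prim's algorithm is used here as one standard way of obtaining a minimum spanning tree; the text does not specify which algorithm was used, and the function names and the toy $4 \times 4$ correlation matrix are illustrative assumptions.

```python
import math


def correlation_distance(C):
    """Distance matrix d_ij = sqrt(2 (1 - C_ij)) from a correlation matrix."""
    n = len(C)
    return [[math.sqrt(max(0.0, 2.0 * (1.0 - C[i][j]))) for j in range(n)]
            for i in range(n)]


def minimum_spanning_tree(d):
    """Prim's algorithm on a dense distance matrix.

    Returns the N - 1 tree edges as tuples (i, j, d_ij).
    """
    n = len(d)
    in_tree = [False] * n
    best = [math.inf] * n      # shortest known link from the tree to each node
    parent = [-1] * n
    best[0] = 0.0              # start growing the tree from node 0
    edges = []
    for _ in range(n):
        # Attach the node outside the tree that is closest to it.
        k = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[k] = True
        if parent[k] >= 0:
            edges.append((parent[k], k, d[parent[k]][k]))
        for j in range(n):
            if not in_tree[j] and d[k][j] < best[j]:
                best[j] = d[k][j]
                parent[j] = k
    return edges


# Toy example: stocks 0-1 and 2-3 form two strongly correlated pairs.
C_toy = [[1.0, 0.9, 0.1, 0.0],
         [0.9, 1.0, 0.2, 0.1],
         [0.1, 0.2, 1.0, 0.8],
         [0.0, 0.1, 0.8, 1.0]]
for i, j, dist in minimum_spanning_tree(correlation_distance(C_toy)):
    print(i, j, round(dist, 3))
```

In the toy example the two strongly correlated pairs are joined by short edges and linked to each other by a single longer edge, which is the kind of sector clustering the tree is meant to expose.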
Therefore, to be able to identify the internal structure of interactions between the stocks we need to remove the market mode, i.e., the effect of the largest eigenvalue. Also, the effect of random noise has to be filtered out. To perform this filtering, we use the method proposed in Ref. [7], where the correlation matrix was expanded in terms of its eigenvalues $\lambda_i$ and the corresponding eigenvectors $\mathbf{u}_i$: $\mathbf{C} = \sum_{i=0}^{N-1} \lambda_i \mathbf{u}_i \mathbf{u}_i^{\top}$. This allows the correlation matrix to be decomposed into three parts, corresponding to the market, sector and random components:
$$\mathbf{C} = \mathbf{C}^{market} + \mathbf{C}^{sector} + \mathbf{C}^{random} = \lambda_0 \mathbf{u}_0 \mathbf{u}_0^{\top} + \sum_{i=1}^{N_s} \lambda_i \mathbf{u}_i \mathbf{u}_i^{\top} + \sum_{i=N_s+1}^{N-1} \lambda_i \mathbf{u}_i \mathbf{u}_i^{\top},$$
where the eigenvalues have been arranged in descending order (the largest labelled 0) and $N_s$ is the number of intermediate eigenvalues. From the empirical data, it is often not obvious what the value of $N_s$ is, as the bulk may deviate from the predictions of random matrix theory because of correlations induced by underlying structure. For this reason, we use visual inspection of the distribution to choose $N_s$, and verify that small changes in this value do not alter the results. The robustness of our results to small variations in the estimation of $N_s$ is because the error involved is only due to the eigenvalues closest to the bulk, which have the smallest contribution to $\mathbf{C}^{sector}$. Fig. 5 shows the result of the decomposition of the full correlation matrix into the three components. Compared to the NYSE, NSE shows a less extended tail for the sector correlation matrix elements $C^{sector}_{ij}$. This implies that the Indian market has a much smaller fraction of strongly interacting stocks, which would be the case if there is no significant segregation into sectors in the market.
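The three-way split of the correlation matrix can be sketched with a standard symmetric eigendecomposition. The function name, the argument `n_sector` (standing for the number of intermediate eigenvalues) and the synthetic one-factor data are assumptions made for illustration only.

```python
import numpy as np


def decompose_correlation(C, n_sector):
    """Split C into market, sector and random parts from its eigenmodes."""
    vals, vecs = np.linalg.eigh(C)       # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]       # reorder: index 0 is the largest
    vals, vecs = vals[order], vecs[:, order]

    def part(idx):
        U = vecs[:, idx]                 # selected eigenvectors as columns
        return (U * vals[idx]) @ U.T     # sum_i lambda_i u_i u_i^T

    N = len(vals)
    C_market = part(slice(0, 1))
    C_sector = part(slice(1, 1 + n_sector))
    C_random = part(slice(1 + n_sector, N))
    return C_market, C_sector, C_random


# Synthetic returns: 8 series sharing one common ("market") factor.
rng = np.random.default_rng(1)
common = rng.standard_normal((1000, 1))
r = 0.5 * common + rng.standard_normal((1000, 8))
C = np.corrcoef(r, rowvar=False)

C_market, C_sector, C_random = decompose_correlation(C, n_sector=2)
print(np.allclose(C_market + C_sector + C_random, C))  # True
```

Because the three parts are built from disjoint sets of eigenmodes, they always add up to the original matrix; only the boundary between the sector and random parts depends on the choice of `n_sector`.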
Next, we construct the network of interactions among stocks by using the information in the sector correlation matrix [7]. The binary-valued adjacency matrix $\mathbf{A}$ of the network is generated from $\mathbf{C}^{sector}$ by using a threshold $c_{th}$ such that $A_{ij} = 1$ if $C^{sector}_{ij} > c_{th}$, and $A_{ij} = 0$ otherwise. If the long tail in the $C^{sector}_{ij}$ distribution is indeed due to correlations among stocks belonging to a particular business sector, this should be reflected in a clustered structure of the network for an appropriate choice of the threshold. Fig. 6 shows the resultant network for the best choice of $c_{th}$ (= 0.09) in terms of creating the largest clusters of related stocks. However, even for the “best” choice we find that only two sectors have been properly clustered, those corresponding to Technology and to Pharmaceutical Companies. The majority of the frequently traded stocks cannot be arranged into well-segregated groups corresponding to the various business sectors they belong to. This failure again reflects the fact that intra-group correlations in most cases are much weaker compared to the market-wide correlation in the Indian market.
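The thresholding step and the identification of clusters can be illustrated as follows. Reading clusters off as connected components of the thresholded graph is an assumption made for this sketch, as is the exclusion of self-links on the diagonal; the function names and the toy matrix are hypothetical.

```python
def threshold_network(C_sector, c_th):
    """Adjacency matrix: A_ij = 1 if C_sector[i][j] > c_th, else 0.

    Diagonal entries are set to 0 (no self-links); this is an assumption
    of the sketch, not something stated in the text.
    """
    n = len(C_sector)
    return [[1 if i != j and C_sector[i][j] > c_th else 0 for j in range(n)]
            for i in range(n)]


def connected_components(A):
    """Groups of nodes that are mutually reachable through links in A."""
    n = len(A)
    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:                      # depth-first search from `start`
            i = stack.pop()
            comp.append(i)
            for j in range(n):
                if A[i][j] and j not in seen:
                    seen.add(j)
                    stack.append(j)
        clusters.append(sorted(comp))
    return clusters


# Toy sector matrix with two groups of two stocks each.
C_sec = [[0.30, 0.12, 0.01, 0.00],
         [0.12, 0.30, 0.02, 0.00],
         [0.01, 0.02, 0.30, 0.15],
         [0.00, 0.00, 0.15, 0.30]]
A = threshold_network(C_sec, c_th=0.09)
print(connected_components(A))  # [[0, 1], [2, 3]]
```

Raising the threshold breaks clusters apart into isolated nodes, while lowering it merges everything into one component, so the threshold has to be scanned to find the value giving the clearest grouping.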
4 Time-evolution of the Correlation Structure
In this section, we study the temporal properties of the correlation matrix. We note here that if the deviations from the random matrix predictions are indicators of genuine correlations, then the eigenvectors corresponding to the deviating eigenvalues should be stable in time, over the period used to calculate the correlation matrix. We choose the eigenvectors corresponding to the 10 largest eigenvalues for the correlation matrix over a period $A$ to construct a $10 \times N$ matrix $\mathbf{D}_A$. A similar matrix $\mathbf{D}_B$ can be generated by using a different time period $B$ having the same duration but a time lag $\tau$ compared to the other. These are then used to generate the $10 \times 10$ overlap matrix $\mathbf{O}(\tau) = \mathbf{D}_A \mathbf{D}_B^{\top}$. In the ideal case, when the 10 eigenvectors are absolutely stable in time, $\mathbf{O}$ would be a $10 \times 10$ identity matrix. For the NSE data we have used time lags of $\tau$ = 6 months, 1 year and 2 years, for a time window of 5 years and the reference period beginning in Jan 1996. As shown in Fig. 7, the eigenvectors show different degrees of stability, with the one corresponding to the largest eigenvalue being the most stable. The remaining eigenvectors show decreasing stability with an increase in the lag period.
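The overlap calculation can be sketched as below, with rows of each matrix holding the leading eigenvectors of one time period. Taking the absolute value of the overlaps is a choice made in this sketch, since the sign of a numerically computed eigenvector is arbitrary; the function names and the synthetic returns are likewise assumptions.

```python
import numpy as np


def top_eigenvectors(C, k):
    """Rows are the unit-norm eigenvectors of the k largest eigenvalues of C."""
    vals, vecs = np.linalg.eigh(C)
    order = np.argsort(vals)[::-1][:k]
    return vecs[:, order].T              # shape (k, N)


def overlap_matrix(C_A, C_B, k=10):
    """|D_A D_B^T| for the k leading eigenvectors of two correlation matrices."""
    D_A = top_eigenvectors(C_A, k)
    D_B = top_eigenvectors(C_B, k)
    return np.abs(D_A @ D_B.T)           # abs() removes the arbitrary sign


# Synthetic returns with one common factor, split into two periods.
rng = np.random.default_rng(3)
common = rng.standard_normal((2000, 1))
r = 0.6 * common + rng.standard_normal((2000, 12))
C_A = np.corrcoef(r[:1000], rowvar=False)
C_B = np.corrcoef(r[1000:], rowvar=False)

O = overlap_matrix(C_A, C_B, k=3)
print(np.round(O, 2))
```

In this synthetic case only the first diagonal entry is expected to be close to 1, because the common factor is the only structure that persists between the two periods.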
Next, we focus on the temporal evolution of the composition of the eigenvector corresponding to the largest eigenvalue. Our purpose is to find the set of stocks that have consistently high contributions to this eigenvector, as these can be identified as the ones whose behavior is dominating the market mode. We study the time-development by dividing the return time-series data into $M$ overlapping sets of length $T$. Two consecutive sets are displaced relative to each other by a time lag $\delta t$. In our study, $T$ is taken as six months (125 trading days), while $\delta t$ is taken to be one month (21 trading days). The resulting correlation matrices, $\mathbf{C}_{T,\delta t}$, can now be analysed to get further understanding of the time-evolution of correlated movements among the different stocks.
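A sliding-window version of the analysis might look like the sketch below, which records the largest eigenvalue and its eigenvector for each window. The defaults of 125 and 21 days follow the text; the function name, the sign convention for the eigenvector and the synthetic returns are assumptions.

```python
import numpy as np


def rolling_largest_eigen(returns, window=125, step=21):
    """Largest eigenvalue and eigenvector of C in overlapping time windows.

    returns: array of shape (T_total, N). Yields one tuple
    (start_index, largest_eigenvalue, eigenvector) per window.
    """
    returns = np.asarray(returns, dtype=float)
    T_total = returns.shape[0]
    results = []
    for start in range(0, T_total - window + 1, step):
        C = np.corrcoef(returns[start:start + window], rowvar=False)
        vals, vecs = np.linalg.eigh(C)   # ascending order: last is largest
        u = vecs[:, -1]
        if u.sum() < 0:
            u = -u                       # fix the arbitrary overall sign
        results.append((start, float(vals[-1]), u))
    return results


# Synthetic returns: 6 series driven partly by a common factor.
rng = np.random.default_rng(2)
market = rng.standard_normal((500, 1))
returns = 0.6 * market + rng.standard_normal((500, 6))
windows = rolling_largest_eigen(returns)
print(len(windows))  # 18
```

Averaging the squared eigenvector components of each stock over all windows gives one simple way of ranking stocks by how consistently they contribute to the market mode.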
In a previous paper [14], we have found that the largest eigenvalue of $\mathbf{C}_{T,\delta t}$ follows closely the time variation of the average correlation coefficient. This indicates that the largest eigenvalue captures the behavior of the entire market. However, the relative contribution to its eigenvector by the different stocks may change over time. We assume that if a company is a really important player in the market, then it will have a significant contribution in the composition of this eigenvector over many time windows. Fig. 8 shows the 50 largest stocks in terms of consistently having large representation in the eigenvector of the largest eigenvalue. Note the existence of 5 companies from the Tata group and 3 companies of the Reliance group in this set. This is consistent with the general belief in the business community that these two groups dominate the Indian market, and may disproportionately affect the market through their actions.
5 Conclusions
In this paper, we have examined the structure of the Indian financial market through a detailed investigation of the spectral properties of the cross-correlation matrix of price returns. We demonstrate that the eigenvalue distribution is similar to that observed for the developed markets of the USA and Japan. However, unlike the latter, the Indian market shows much less evidence of the existence of business sectors having distinct identities. In fact, most of the observed correlation among stocks is due to effects common to the entire market, which has the effect of making the Indian market appear more correlated than developed markets. We hypothesise that the reason why emerging markets have often been reported to be significantly more correlated is that they are distinguished from developed ones by the absence of strong interactions between clusters of stocks in the former. This has implications for the understanding of markets as complex interacting systems, namely, that interactions emerge between groups of stocks as a market evolves over time to finally exhibit the clustered structure characterizing, e.g., the NYSE. How such self-organization is related to other changes a market undergoes as it develops is a question worth pursuing with the tools available to econophysicists. From the point of view of possible applicability, these results are of significance to the problem of portfolio diversification. With the advent of liberalization, there has been a significant flow of investment into the Indian market. The question of how investments can be made over a balanced portfolio of stocks so as to minimize risks assumes importance in such a situation. Our study indicates that schemes for constructing such optimized portfolios must take into account the fact that emerging markets are in general less differentiated and more correlated than developed markets.
| No. | Stock | Sector | No. | Stock | Sector |
|---|---|---|---|---|---|
| 11 | EICHERMOT | Automobiles & Transport | 71 | NOCIL | Basic Materials |
| 12 | HINDMOTOR | Automobiles & Transport | 72 | GOODLASNER | Basic Materials |
| 13 | PUNJABTRAC | Automobiles & Transport | 73 | SPIC | Basic Materials |
| 14 | SWARAJMAZD | Automobiles & Transport | 74 | TIRUMALCHM | Basic Materials |
| 15 | SWARAJENG | Automobiles & Transport | 75 | TATACHEM | Basic Materials |
| 16 | LML | Automobiles & Transport | 76 | GHCL | Basic Materials |
| 17 | VARUNSHIP | Automobiles & Transport | 77 | GUJALKALI | Basic Materials |
| 18 | APOLLOTYRE | Automobiles & Transport | 78 | PIDILITIND | Basic Materials |
| 19 | CEAT | Automobiles & Transport | 79 | FOSECOIND | Basic Materials |
| 20 | GOETZEIND | Automobiles & Transport | 80 | BASF | Basic Materials |
| 21 | MRF | Automobiles & Transport | 81 | NIPPONDENR | Basic Materials |
Acknowledgements: We thank N. Vishwanathan for assistance in preparing the data for analysis and M. Marsili for helpful discussions.
- (1) Rouwenhorst K G (2005) The origins of mutual funds. In: Goetzmann, W N, Rouwenhorst, K G (eds) The Origins of Value: The financial innovations that created modern capital markets. Oxford Univ Press, New York.
- (2) Laloux L, Cizeau P, Bouchaud J P, Potters M (1999) Noise dressing of financial correlation matrices, Phys. Rev. Lett. 83: 1467–1470
- (3) Plerou V, Gopikrishnan P, Rosenow B, Amaral L A N, Stanley H E (1999) Universal and nonuniversal properties of cross correlations in financial time series, Phys. Rev. Lett. 83: 1471–1474
- (4) Gopikrishnan P, Rosenow B, Plerou V, Stanley H E (2001) Quantifying and interpreting collective behavior in financial markets, Phys. Rev. E 64: 035106
- (5) Plerou V, Gopikrishnan P, Rosenow B, Amaral L A N, Guhr T, Stanley H E (2002) Random matrix approach to cross correlations in financial data, Phys. Rev. E 65: 066126
- (6) Noh J D (2000) Model for correlations in stock markets, Phys. Rev. E 61: 5981–5982
- (7) Kim D-H, Jeong H (2005) Systematic analysis of group identification in stock markets, Phys. Rev. E 72: 046133
- (8) Utsugi A, Ino K, Oshikawa M (2004) Random matrix theory analysis of cross correlations in financial markets, Phys. Rev. E 70: 026110
- (9) Morck R, Yeung B, Yu W (2000) The information content of stock markets: Why do emerging markets have synchronous stock price movements?, J. Financial Economics 58: 215–260
- (10) Wilcox D, Gebbie T (2004) On the analysis of cross-correlations in South African market data, Physica A 344: 294–298; Wilcox D, Gebbie T (2007) An analysis of cross-correlations in an emerging market, Physica A 375: 584–598
- (11) Kulkarni V, Deo N (2005) Volatility of an Indian stock market: A random matrix approach, physics/0512169
- (12) Jung W-S, Chae S, Yang J-S, Moon H-T (2006) Characteristics of the Korean stock market correlations, Physica A 361: 263–271
- (13) Cukur S, Eryigit M, Eryigit R (2007) Cross correlations in an emerging market financial data, Physica A 376: 555–564
- (14) Sinha S, Pan R K (2006) The power (law) of Indian markets: Analysing NSE and BSE trading statistics, In: Chatterjee A, Chakrabarti B K (eds) Econophysics of Stock and Other Markets. Springer, Milan.
- (15) Pan R K, Sinha S (2007) Self-organization of price fluctuation distribution in evolving markets, Europhys. Lett. 77: 58004
- (16) Pan R K, Sinha S (2006) Inverse cubic law of index fluctuation distribution in Indian markets, physics/0607014
- (17) National Stock Exchange (2004) Indian securities market: A review. (http://www.nseindia.com/content/us/ismr2005.zip)
- (18) http://www.nseindia.com/
- (19) Sengupta A M, Mitra P P (1999) Distribution of singular values for some random matrices, Phys. Rev. E 60: 3389–3392
- (20) Mantegna R N (1999) Hierarchical structure in financial markets, Eur. Phys. J. B 11: 193–197
- (21) Onnela J-P, Chakraborti A, Kaski K, Kertesz J (2002) Dynamic asset trees and portfolio analysis, Eur. Phys. J. B 30: 285–288