Recently, the CER team presented the results of its investigations into the manipulation of trading volumes through wash trading. Using statistical analysis of trade data, we conducted two separate studies, Transaction Volume Analysis and Time Series Analysis, to look for signs of fraudulent activity on 7 popular crypto exchanges. A summary of both studies follows.
The reasons for the investigation
One of the factors restraining the development of the cryptocurrency market is fraud. An example of such malicious behavior is wash trading, a form of market manipulation in which an investor or institution simultaneously sells and buys the same financial instrument to create misleading, artificial activity in the market, typically using large transactions or trading orders to reduce the risk of loss. Wash trading is strictly prohibited on traditional financial markets.
Scope and Approaches of the Investigation
For the analysis, we selected data on BTC/USDT transactions in the 2nd quarter of 2018 on 7 cryptocurrency exchanges: Binance, OKex, HuobiPro, HitBTC, Bittrex, Poloniex, and KuCoin. Note that data for some periods (up to about 10% of the quarter) was not available for two exchanges: HuobiPro and OKex.
The CER team applied two approaches to the study: transaction volume analysis, described in the blog on September 3, 2018, and time series analysis, described on September 11, 2018.
Transaction Volume Analysis
For Transaction Volume Analysis, we used the following methodology for each cryptocurrency exchange:
- first, the volume of trades (VT) and the number of transactions (TxN) were calculated, as well as the ratio between them;
- second, transactions with abnormal volume were defined as outliers. Outliers were identified using the median as the central value and the inter-percentile range (IPR) between the 90th and 10th percentiles: transactions with a volume exceeding the median by more than 3 IPRs were considered outliers;
- third, each exchange’s average transaction volume (ATV), excluding outliers, was compared with the average transaction volume combined (ATVC), the average value of all exchanges’ transactions excluding outliers;
- fourth, the outliers alone were analyzed: their share in VT, their rate in TxN, and the average outliers volume (AOV) were calculated, and the lowest outlier value on each exchange was defined as its outliers threshold.
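The outlier rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the CER team's actual code: the function names `percentile` and `split_outliers` are hypothetical, and the linear-interpolation percentile is an assumption, since the source does not say how percentiles were computed.

```python
import statistics

def percentile(sorted_values, q):
    """Percentile q (0..100) of a pre-sorted list, by linear interpolation.

    The interpolation method is an assumption; the source does not specify one.
    """
    if not sorted_values:
        raise ValueError("no data")
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (pos - lo)

def split_outliers(volumes, k=3.0):
    """Split transaction volumes into (normal, outliers, cutoff).

    A transaction is treated as an outlier when its volume exceeds
    median + k * IPR, where IPR = P90 - P10 and k = 3 as in the text.
    """
    ordered = sorted(volumes)
    median = statistics.median(ordered)
    ipr = percentile(ordered, 90) - percentile(ordered, 10)
    cutoff = median + k * ipr
    normal = [v for v in volumes if v <= cutoff]
    outliers = [v for v in volumes if v > cutoff]
    return normal, outliers, cutoff
```

Using the median and a percentile range rather than the mean and standard deviation keeps the cutoff from being dragged upward by the very large trades it is meant to isolate.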
Steps of the Research
After VT and TxN computation, the ratio between them was calculated (Graph 3).
The ratio represents the average volume per transaction on each exchange. Its values for OKex and HitBTC were higher than for the other exchanges, which looked suspicious but did not yet prove trade manipulation.
Similar results were obtained when comparing each exchange’s ATV with ATVC (Graph 4).
HitBTC and OKex again stood out from others, as their ATV exceeded ATVC by 2.29 and 1.72 times respectively.
For outliers, the most insightful results came from the calculation of the outliers threshold and the average outliers volume (AOV) (Graphs 7 and 8).
Outliers threshold and AOV values for OKex were noticeably higher than for all others, particularly as much as 3.4 and 4 times higher than those for Binance respectively.
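The per-exchange measures used in these steps can be computed as in the following sketch. It assumes each exchange's transactions have already been split into non-outlier and outlier volumes, and that ATVC is the pooled mean over all non-outlier transactions of all exchanges (one reading of the definition above). The function names and dictionary keys are hypothetical.

```python
def exchange_metrics(normal, outliers):
    """Summarize one exchange from its non-outlier and outlier volumes.

    Assumes `normal` is non-empty.
    """
    vt = sum(normal) + sum(outliers)      # volume of trades (VT)
    txn = len(normal) + len(outliers)     # number of transactions (TxN)
    return {
        "VT": vt,
        "TxN": txn,
        "VT/TxN": vt / txn,
        "ATV": sum(normal) / len(normal),  # average volume excluding outliers
        "outlier_share_of_VT": sum(outliers) / vt,
        "outlier_rate_in_TxN": len(outliers) / txn,
        "AOV": sum(outliers) / len(outliers) if outliers else None,
        "outlier_threshold": min(outliers) if outliers else None,
    }

def atv_vs_atvc(normal_by_exchange):
    """Ratio of each exchange's ATV to the pooled ATVC (assumed definition)."""
    pooled = [v for vols in normal_by_exchange.values() for v in vols]
    atvc = sum(pooled) / len(pooled)
    return {name: (sum(vols) / len(vols)) / atvc
            for name, vols in normal_by_exchange.items()}
```

A ratio well above 1 from `atv_vs_atvc` corresponds to the kind of gap reported for HitBTC and OKex; the sketch only computes the measures and makes no judgment about where a suspicious level begins.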
Results of Transactions Volume Analysis
As a result of the transaction volume analysis, the CER team found that at least two crypto exchanges showed suspicious results:
- HitBTC, with an ATV 2.29 times higher than the average transaction volume across all exchanges. This is consistent with the idea that wash trading is usually performed with trades larger than the average volume of normal transactions, but additional, more detailed analysis is required to prove the use of wash trades;
- OKex, with an average transaction volume 1.72 times higher than the aggregate average for the 7 exchanges, and with minimal and average outlier values 3.4 and 4 times higher than the corresponding measures for Binance. Together, these findings suggest that OKex is very likely conducting volume manipulations, particularly wash trading.
Time series analysis
The purpose of the second study was to identify periodic changes in the volume traded (VT), which may indicate the intervention of fraudulent automated systems and the distortion of volume data through wash trading. Since wash trading is usually carried out with larger than average transaction volumes, the time series analysis was performed solely on outliers.
Methodology and Algorithm
Time Series Analysis (TSA) was applied at the next stage of the study, presented in the blog on September 11, 2018. Although TSA is usually used to model or forecast future values from historical data, in this investigation it was used only to spot components that are unnatural for fair financial markets. Two functions were used in particular:
- the autocorrelation function (ACF), which shows the correlation of the time series observations with values of the same series at previous times. One of its main uses is checking data for randomness;
- the partial autocorrelation function (PACF), which summarizes the relationship between an observation in a time series and an observation at a given prior time step, with the influence of the observations in between removed. The difference between the two is that the ACF at a given lag includes both the direct correlation and the indirect correlation carried through the intermediate observations, while the PACF shows only the direct correlation between the two observations.
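To make the two functions concrete, here is a minimal pure-Python sketch of the sample ACF and of the PACF obtained from it with the Durbin-Levinson recursion. It is an illustration only: the source does not say which tool or estimator the CER team used, and the function names are hypothetical.

```python
def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("series is constant")
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(max_lag + 1)]

def pacf(series, max_lag):
    """Partial autocorrelation for lags 0..max_lag (Durbin-Levinson recursion)."""
    rho = acf(series, max_lag)
    result = [1.0]
    prev = []  # prev[j] holds the AR coefficient phi(k-1, j+1)
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den  # direct correlation at lag k
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        result.append(phi_kk)
    return result
```

At lag 1 the two functions coincide, because there are no intermediate observations to remove. Statistical libraries such as statsmodels provide equivalent functions with confidence intervals and are the more practical choice for real data.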
The study was based on the assumption that clean market data is supposed to be characterized by stochastic (random) movements, for example, it should not contain seasonal or cyclic components. While seasonal components are present in data showing regular fluctuations within fixed periods, cyclic components are not of the fixed period and usually have a longer duration than seasonal patterns.
The analysis had the following algorithm:
- each exchange’s VT curve was visualized;
- the data was divided into smaller periods with similar characteristics of the VT curve;
- each such period was analyzed:
- data for outliers was retrieved;
- ACF and PACF were calculated for data aggregated by different time frames and with different time lags, and then visualized as correlograms for the analysis.
On some of the analyzed correlograms, the CER team detected significant patterns suggesting that trade in those periods was not random. Such patterns included seasonal and cyclic components, as well as values or spikes standing out significantly from the confidence interval (blue area on correlograms). Graphs for some crypto exchanges that most clearly show the results of the study are given below.
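The aggregation and correlogram steps of this algorithm can be sketched as follows. The input format (pairs of Unix timestamp and volume for outlier trades), the function names, and the approximate 95% white-noise band of ±1.96/√n are all assumptions made for illustration; the source does not state how the confidence interval on its correlograms was computed.

```python
import math
from collections import defaultdict

def aggregate_volume(trades, frame_seconds):
    """Sum volume into consecutive time frames; empty frames become 0.

    trades: iterable of (unix_timestamp, volume) pairs (assumed format).
    """
    buckets = defaultdict(float)
    for ts, volume in trades:
        buckets[int(ts // frame_seconds)] += volume
    if not buckets:
        return []
    return [buckets.get(i, 0.0) for i in range(min(buckets), max(buckets) + 1)]

def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag (non-constant series)."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(max_lag + 1)]

def significant_lags(series, max_lag, z=1.96):
    """Lags whose ACF lies outside the approximate band +/- z / sqrt(n)."""
    bound = z / math.sqrt(len(series))
    rho = acf(series, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(rho[k]) > bound]

# Synthetic example: ten days of hourly data with one large trade per day.
trades = [(h * 3600, 100.0 if h % 24 == 0 else 1.0) for h in range(240)]
hourly = aggregate_volume(trades, 3600)
print(significant_lags(hourly, 30))
```

In this synthetic series the daily repetition shows up as a significant ACF value at lag 24 on hourly aggregation, which is the kind of seasonal signature the analysis looks for. Changing `frame_seconds` (for example to 4 hours) corresponds to the different time frames mentioned in the algorithm.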
Graph 1 presents Binance’s VT chart for the 2nd quarter of 2018 with 4 periods distinguished by similar VT curve characteristics.
Binance correlograms do not demonstrate any cyclic or seasonal components. However, a slowly decaying ACF without peaks in some periods (the 1st and 4th) indicates the presence of a trend, which does not necessarily imply fraudulent activity.
Graph 15 presents OKex’s VT chart for the 2nd quarter of 2018 with 5 periods distinguished by similar VT curve characteristics and one period (from April 29 to May 4) excluded due to the gap in data.
OKex’s ACF for the 1st period shows a minor cyclic pattern, indicating that trade volume in that period may not be random.
ACFs for the 4th and 5th periods (Graphs 17 and 18) demonstrate an obvious seasonal component with 24-hour periodicity, which is clearly distinguishable for the former and diluted by the trend for the latter.
The appearance of such patterns indicates cyclic trading activity, most likely performed by automated programs (trade bots) aimed at manipulating trade volume.
Graph 19 presents Poloniex’s VT chart for the 2nd quarter of 2018. The dataset was analyzed as a whole, since there were no obviously distinguishable periods with similar characteristics on the VT curve.
Poloniex’s ACF displays numerous non-periodic peak values exceeding the confidence interval on 4-hour data aggregation.
This signifies that the exchange’s trade volume from outliers is definitely not random and should be analyzed in more detail to define the nature of such uncommon relations within the data.
The results of TSA
Time series analysis applied to the 7 exchanges showed that only two of them, Binance and KuCoin, do not have any suspicious patterns. While the minor cyclic components detected for Bittrex, HuobiPro, HitBTC, and OKex should be considered unnatural, it is possible that they reflect normal volume behavior driven by price fluctuations. In contrast, the ACF values that stood out significantly from the confidence interval (spotted for HitBTC, HuobiPro, and Poloniex) suggest volume that is definitely not random but whose nature is undefined and requires a more thorough analysis. OKex showed the strongest evidence of trade volume manipulation: its ACFs for two periods displayed 24-hour cyclic trading activity, apparently performed by wash trading bots.
The CER team’s investigation showed the effectiveness of applying statistical methods of data analysis to detect manipulation of trading volumes on cryptocurrency exchanges. Applying the two approaches, transaction volume analysis and time series analysis, allowed us to assert that out of the 7 crypto exchanges, only on Binance and KuCoin does the trading volume correspond to natural activity. Both the transaction volume analysis and TSA found evidence of wash trading on OKex. As for the remaining exchanges, Bittrex, HuobiPro, HitBTC, and Poloniex, some signs of artificially formed trade volume were found, but they require additional research.