The crypto industry is rapidly developing due to the innovative potential of blockchain technology as an opportunity to ensure transparency and fairness in the financial market, politics, and business. The most obvious outcome of blockchain progress is the disruption and evolution of traditional financial markets.
However, various barriers still hinder this progress. The cryptocurrency market is overcrowded with unfair players involved in various types of fraudulent activity, the most widespread of which is wash trading, a practice strictly prohibited in classical financial markets.
Undoubtedly, wash trading constitutes one of the main factors impeding the development of the crypto industry and future global tokenization. This manipulation contributes to an increasing negative image of the crypto industry.
WASH TRADE — THE MOST POPULAR METHOD OF TRADE VOLUME MANIPULATION
The fundamental institutions within the industry, TOP crypto exchanges, contribute to the negative image of the blockchain industry by manipulating the trade volume via Wash Trade. Wash trade is a form of market manipulation in which an investor or institution simultaneously sells and buys the same financial instruments to create misleading, artificial activity in the market. While it can be carried out in different ways, wash trade typically means using large transactions/trading orders to reduce the risk of loss.
The crypto market is relatively confined, meaning that even simple observations can spot large manipulations. An example of this can be seen in our previous research on BitForex Success Case. However, this time, we decided to apply a more scientific approach to effectively uncover fraud.
Key Findings: Results of the analysis showed that the trade volumes of most of the investigated exchanges (all except Binance and KuCoin) were not random. Some of them, including HuobiPro, HitBTC and especially Poloniex, showed autocorrelation values far outside the confidence interval, suggesting that their volume is not random (not organic trade volume) but is of an undefined nature. Obvious seasonal 24-hour components detected for OKex indicate the presence of distinctly artificial processes which are very likely aimed at manipulating trade volume by means of wash trade.
Methodology of Analysis
In the current study, we applied time series analysis, in particular the autocorrelation and partial autocorrelation functions, to detect cyclical and seasonal components in the investigated data. These functions also reveal a trend if the data series has one; however, trends are not significant for our analysis.
Time series analysis (TSA) is usually used for modeling (forecasting) some future aspects based on historical data. Since there is no need to make any forecasts in the context of the current investigation, we used it only to spot certain components that are unnatural for fair financial markets. For the analysis we used the following TSA tools:
- Autocorrelation Function (ACF) shows the correlation of the time series observations with values of the same series at previous times. One of the main purposes of its use is checking the data for randomness.
- Partial Autocorrelation Function (PACF) summarizes the relationship between an observation in a time series and observations at prior time steps, with the influence of the intermediate observations removed. The difference between ACF and PACF is that the former shows the correlation of an observation with a lagged observation including the indirect effects carried through all the observations in between, while the latter displays only the direct correlation between the two observations.
The research is based on the assumption that clean market data is characterized by stochastic (random) movements; for example, clean market data should not contain a seasonal or cyclic component.
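As an illustration of the two functions described above, here is a minimal sketch in pure Python. It is not the tooling used in this research; the function names `acf` and `pacf` and the synthetic series are assumptions made for the example. The PACF is computed with the standard Durbin-Levinson recursion.

```python
import random
from statistics import fmean


def acf(series, max_lag):
    """Sample autocorrelation for lags 0..max_lag."""
    n = len(series)
    mean = fmean(series)
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(max_lag + 1)
    ]


def pacf(series, max_lag):
    """Sample partial autocorrelation via the Durbin-Levinson recursion."""
    rho = acf(series, max_lag)
    result = [1.0]
    phi_prev = []  # coefficients of the fitted AR(k-1) model
    for k in range(1, max_lag + 1):
        if k == 1:
            phi_kk = rho[1]
            phi = [phi_kk]
        else:
            num = rho[k] - sum(phi_prev[j] * rho[k - 1 - j] for j in range(k - 1))
            den = 1.0 - sum(phi_prev[j] * rho[j + 1] for j in range(k - 1))
            phi_kk = num / den
            phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
            phi.append(phi_kk)
        result.append(phi_kk)
        phi_prev = phi
    return result


# Synthetic example (not exchange data): each value depends only on the previous one.
random.seed(0)
demo = [0.0]
for _ in range(1999):
    demo.append(0.6 * demo[-1] + random.gauss(0, 1))

print("ACF :", [round(v, 2) for v in acf(demo, 3)])
print("PACF:", [round(v, 2) for v in pacf(demo, 3)])
```

For a series like this one, the ACF fades gradually over several lags because the dependence is passed along the chain of observations, whereas the PACF is large only at lag 1, where the direct dependence sits.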
- Seasonal component is supposed to exist when data show regular fluctuations within fixed periods.
- Cyclical component is supposed to exist when data show rises and falls that are not of fixed periods. The average length of cycles is greater than the length of a seasonal pattern.
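To make the idea of a seasonal component concrete, the sketch below compares a purely random hourly volume series with the same series plus a hypothetical bot that adds volume during the same four hours every day. All numbers are synthetic assumptions chosen for illustration, not exchange data.

```python
import random
from statistics import fmean


def acf_at(series, lag):
    """Sample autocorrelation of a series at a single lag."""
    mean = fmean(series)
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return sum(dev[t] * dev[t - lag] for t in range(lag, len(series))) / denom


random.seed(1)
hours = 24 * 30  # 30 days of hourly observations

# "Organic" volume: independent random draws, no structure in time.
random_volume = [random.expovariate(1.0) for _ in range(hours)]

# Hypothetical bot: extra volume in the first 4 hours of every day.
seasonal_volume = [
    v + (5.0 if h % 24 < 4 else 0.0) for h, v in enumerate(random_volume)
]

for name, series in (("random", random_volume), ("seasonal", seasonal_volume)):
    print(name, "ACF at lag 24:", round(acf_at(series, 24), 2))
```

The random series shows an autocorrelation close to zero at lag 24, while the series with the daily burst shows a strong positive value there. This is the kind of regular peak the analysis looks for.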
Thus, the goal of this analysis is to investigate whether there are any periodical increases and decreases in the volume traded (VT) which may indicate the presence (turning on/off) of cheaters’ automated trading programs engaged in wash trade practices on the exchanges analyzed. Since the wash trade is usually carried out with transactions of larger than average volume, we focused our study solely on outliers, trades that lie outside the overall pattern of trade volume distribution or, simply put, all trades of much larger volume than average.
The analysis algorithm is as follows:
- Visualize the VT curve for each exchange’s trade data aggregated by different timeframes.
- Split the data into smaller portions distinguished by the VT curve’s characteristics that are similar for certain time periods.
- Investigate each part of the data:
- Extract outliers.
- Build ACF (Autocorrelation Function) and PACF (Partial Autocorrelation Function) for data aggregated by different timeframes with different time lags, and then visualize and analyze them.
For outlier extraction, we used the average value and the inter-percentile range (a more robust analog of the standard deviation) for each portion of data separately. We took the median as the average value of each sample and calculated the inter-percentile range (IPR) as the difference between the 90th and 10th percentiles. Trades with a volume exceeding the median by more than 3 IPR were considered outliers.
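The outlier rule described above can be sketched as follows. The function name `extract_outliers` and the sample volumes are assumptions for illustration. The percentile interpolation method is also an assumption, since the exact method used in the research is not stated.

```python
from statistics import median, quantiles


def extract_outliers(volumes, k=3.0):
    """Return (outliers, threshold) using median + k * (P90 - P10)."""
    # quantiles(..., n=10) returns the 9 decile cut points:
    # index 0 is the 10th percentile, index 8 is the 90th.
    # method="inclusive" is an assumption about the interpolation rule.
    deciles = quantiles(volumes, n=10, method="inclusive")
    ipr = deciles[8] - deciles[0]
    threshold = median(volumes) + k * ipr
    return [v for v in volumes if v > threshold], threshold


# Synthetic trade volumes: 100 ordinary trades and one very large one.
sample = list(range(1, 101)) + [1000]
outliers, threshold = extract_outliers(sample)
print("threshold:", threshold)
print("outliers:", outliers)
```

Because both the median and the IPR ignore the extreme tails, a handful of very large trades barely moves the threshold, which is the reason for preferring them to the mean and standard deviation here.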
Time Series Analysis
The scope of analysis – BTC/USDT pair in Q2 2018. We analyzed the trade data for 7 exchanges; namely, Binance, OKex, HuobiPro, HitBTC, Bittrex, Poloniex and KuCoin.
After having tried different timeframes for VT visualization, we determined 4-hour aggregation as the most appropriate for distinguishing periods with similar characteristics for all graphs.
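A minimal sketch of the aggregation step, assuming each trade is a hypothetical `(timestamp, volume)` pair with a timezone-aware UTC timestamp (the actual CoinAPI record layout is not shown in this article):

```python
from collections import defaultdict
from datetime import datetime, timezone


def aggregate_volume(trades, bucket_hours=4):
    """Sum traded volume into fixed-width time buckets aligned to 00:00 UTC."""
    bucket_seconds = bucket_hours * 3600
    totals = defaultdict(float)
    for ts, volume in trades:
        epoch = int(ts.timestamp())
        bucket_start = epoch - epoch % bucket_seconds
        totals[bucket_start] += volume
    # Buckets without trades are simply absent; fill them with zeros
    # before computing an ACF so that lags map to equal time steps.
    return {
        datetime.fromtimestamp(start, tz=timezone.utc): total
        for start, total in sorted(totals.items())
    }


# Synthetic trades for illustration only.
trades = [
    (datetime(2018, 4, 1, 0, 30, tzinfo=timezone.utc), 1.0),
    (datetime(2018, 4, 1, 3, 59, tzinfo=timezone.utc), 2.0),
    (datetime(2018, 4, 1, 4, 0, tzinfo=timezone.utc), 0.5),
]
for start, total in aggregate_volume(trades).items():
    print(start.isoformat(), total)
```

Changing `bucket_hours` to 1, 2, 3 or 12 gives the other aggregations mentioned in the analysis of individual exchanges.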
Then, we built ACF and PACF graphs for each period and showed those on which we detected significant patterns indicating that the process was not of a random nature. Examples of such patterns are seasonal and cyclic components, as well as values or spikes standing out significantly from the confidence interval (blue area). In statistics, a confidence interval is a range of values that is expected to contain a parameter of interest; in our case, autocorrelation values that fall outside it are statistically significant for the analysis.
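As a rough sketch of how such a check can be automated, the snippet below flags lags whose autocorrelation falls outside the band. It assumes the common large-sample 95% band of ±1.96/√N for a random series; the confidence level behind the blue area on our graphs is not stated here, so both the level and the function name `significant_lags` are assumptions.

```python
from math import sqrt


def significant_lags(acf_values, n_obs, z=1.96):
    """Lags (excluding lag 0) whose autocorrelation lies outside +/- z / sqrt(n_obs)."""
    bound = z / sqrt(n_obs)
    return [
        lag
        for lag, r in enumerate(acf_values)
        if lag > 0 and abs(r) > bound
    ]


# Made-up autocorrelation values for lags 0..4 of a 400-point series.
example_acf = [1.0, 0.05, -0.02, 0.30, 0.01]
print(significant_lags(example_acf, n_obs=400))
```

With 400 observations the band is roughly ±0.098, so only the value at lag 3 counts as significant in this made-up example.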
Based on Binance VT curve visualization (graph 1), we distinguished the following periods as periods with different characteristics:
- From the 1st to the 30th of April;
- From the 1st to the 8th of May;
- From May 9th to June 9th;
- From the 10th to the 30th of June.
Binance correlograms do not demonstrate any cyclic or seasonal components. However, a slowly decaying ACF without peaks in some periods (1st and 4th) indicates the presence of a trend, although this does not necessarily indicate fraudulent activity.
Therefore, there is no suspicious activity detected on Binance that can be revealed by our research.
Based on Bittrex VT curve visualization (graph 4) we distinguished the following periods:
- From the 1st to the 30th of April;
- From the 1st to the 31st of May;
- From the 1st to the 30th of June.
In the 1st period, Bittrex demonstrates a minor cyclic pattern on 12-hour aggregation with a time lag of 9 periods (9*12=108 hours). In the 2nd period, Bittrex demonstrates a minor cyclic component with slightly significant spikes on 12-hour aggregation with a lag of 5 periods (5*12=60 hours). While the 2 minor cyclic components of different periodicity spotted for Bittrex should be considered unnatural, we can assume that they might be normal volume performance depending on price fluctuations.
Based on HitBTC VT curve visualization (graph 7) we distinguished the following periods:
- From the 1st to the 9th of April;
- From April 10th to May 17th;
- From the 18th to the 30th of May.
In the 1st period, HitBTC demonstrates a minor cyclic pattern on 2-hour aggregation. HitBTC’s ACF in the 3rd period shows a number of non-periodical but significant spikes.
While a minor cyclic component spotted in HitBTC should be considered unnatural, we can assume that it might be normal volume performance depending on price fluctuations. In addition, the ACF values detected for HitBTC that stand out significantly from the confidence interval suggest that they are definitely not random but have an undefined nature.
Based on HuobiPro VT curve visualization (graph 10) we distinguished the following periods:
- From the 1st to the 8th of April;
- From the 9th to the 27th of April;
- From May 6th to June 30th.
The period from April 28th to May 5th was excluded from analysis due to a gap in the data (see assumptions). HuobiPro’s ACF on 3-hour data aggregation for the 1st period demonstrates a trend along with cyclic components with lags of 8 periods (3*8=24 hours), which can be distinguished on the VT plot as well. HuobiPro’s ACF for the 3rd period displays some non-periodical but significant spikes. We suggest that the minor cyclic component spotted for HuobiPro is normal volume performance based on price fluctuations. Moreover, the significant ACF values are not accidental but have an undefined nature.
On KuCoin’s visualized VT curve (graph 13), there are no obvious periods with similar characteristics to be distinguished. Therefore, we analyzed the whole data series at once. KuCoin’s correlogram does not demonstrate any cyclic/seasonal components or trends. This means we did not detect any non-random or suspicious patterns.
Based on OKex’s VT curve visualization (graph 15), we distinguished the following periods:
- From the 1st to the 7th of April;
- From the 8th to the 14th of April;
- From April 14th to May 1st;
- From the 2nd to the 28th of May;
- From the 5th to the 30th of June.
The period from April 29 to May 4 was excluded from the analysis due to a gap in the data (see assumptions). OKex’s ACF for the 1st period shows a minor cyclic pattern, pointing to changes in trading volume on the exchange which may not be random. OKex’s ACF for the 3rd period displays an obvious seasonal component with 24-hour periodicity on 1-hour data aggregation. These abnormalities mean that the trade volume (outlier transactions only) on OKex within this period of time is artificial. Moreover, it looks like this activity stems from an automated volume pump. A slowly decaying ACF for the 5th period, along with apparent periodical peaks with a 24-hour cycle, indicates the presence of both trend and seasonal components. This is the same situation as in the previous graphs, but the cyclic activity is diluted by the trend. To sum up, considering these graphs of the OKex trade volume from outliers, we can infer that cyclic wash trade activity was conducted on the OKex exchange in the BTC/USDT pair. Most likely, the goal was to simulate market activity in order to report higher volume and attract more traders.
In turn, we analyzed the whole Q2 2018 data series for Poloniex, since there were no obvious periods with similar characteristics on its VT curve. Poloniex’s correlogram shows a slowly decaying ACF with a lot of non-periodical values that stand out significantly from the confidence interval on 4-hour data aggregation. This signifies that the trade volume from outliers on the exchange is definitely not random. Thus, we think that trade volume on Poloniex should be analyzed in more detail to define the nature of such uncommon relations within the data, and hopefully to find more evidence of wash trade.
Based on the results of our analysis, we can claim that two exchanges do not have any suspicious patterns. We found nothing on KuCoin’s correlogram, and Binance demonstrated only the presence of a trend. But other exchanges showed the existence of non-random processes.
Bittrex has two minor cyclic patterns with different lags in two periods (1st and 2nd). HitBTC shows one minor cyclic pattern (1st period) and a period with significant spikes (3rd period). HuobiPro’s ACFs display a period with significant spikes (3rd period) and a combination of trend and cyclic components (1st period). Poloniex’s whole Q2 2018 ACF demonstrates that almost all values are significantly outstanding.
Finally, OKex is again the leader in raising red flags. It has three suspicious patterns: a minor cyclic component (1st period), a combination of trend and a seasonal 24-hour component (5th period), and obvious seasonal 24-hour components (4th period).
Red Flags Detected
While minor cyclic components of different periodicity spotted for Bittrex, HuobiPro, HitBTC, and OKex should be considered unnatural, we can assume that they might be normal volume performance depending on price fluctuations. However, the ACF values detected for HitBTC, HuobiPro, and Poloniex stand out significantly from the confidence interval and suggest that they are definitely not random but have an undefined nature that requires a more thorough analysis.
Moreover, obvious seasonal 24-hour components detected for OKex indicate the presence of artificial processes that are very likely aimed at manipulating trade volume by means of wash trade. It’s clear that after earlier accusations of volume manipulation by Sylvain Ribes, OKex stopped doing it so obviously starting in April. However, with the use of advanced tools, our analysis revealed that the exchange has yet to put an end to the malpractice and has instead just learned how to disguise it better.
Since four of the seven observed exchanges (HuobiPro, OKex, Bittrex, and KuCoin) do not provide historical trade data via their API (see PS: Data Gathering Problems from previous research), we took the data for all exchanges from a single source (CoinAPI) so that the results are comparable. Datasets received from CoinAPI have a number of gaps of different lengths. In total, we are missing significant data for two exchanges:
- Huobi – 9 days, 4 hours (~10% of Q2 2018)
- OKex – 8 days, 4 hours, 25 minutes (~9% of Q2 2018)
We consider Binance’s missing data of 1 day, 11 hours, and 25 minutes (~1.6%) insignificant, as well as other minor data gaps for Poloniex (2 hours and 30 minutes, ~0.114%) and KuCoin (45 minutes, ~0.034%).
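The percentages above follow from dividing each gap by the length of Q2 2018 (April 1 to June 30, 91 days). A short sketch of that arithmetic, with the gap lengths taken from the figures above and the helper name `gap_share` chosen for illustration:

```python
from datetime import datetime, timedelta

# Q2 2018 runs from April 1 up to (not including) July 1: 91 days.
QUARTER = datetime(2018, 7, 1) - datetime(2018, 4, 1)

GAPS = {
    "Huobi": timedelta(days=9, hours=4),
    "OKex": timedelta(days=8, hours=4, minutes=25),
    "Binance": timedelta(days=1, hours=11, minutes=25),
    "Poloniex": timedelta(hours=2, minutes=30),
    "KuCoin": timedelta(minutes=45),
}


def gap_share(gap):
    """Share of the quarter covered by a gap, in percent."""
    return 100 * (gap / QUARTER)


for exchange, gap in GAPS.items():
    print(f"{exchange}: {gap_share(gap):.3f}%")
```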
Don’t hesitate to contact us via [email protected], if you have suggestions on how to make these reviews more interesting and effective.