Bitcoin Dataset

Catalog

Bitcoin Partial Transaction Dataset

Introduction

The Bitcoin Partial Transaction Datasets contain three snapshots of Bitcoin transaction data for easier analysis, namely dataset1_2014_11_1500000, dataset2_2015_6_1500000 and dataset3_2016_1_1500000. The snapshots were sampled between November 2014 and January 2016 at roughly half-year intervals. Each snapshot contains the first 1,500,000 transaction records of its month, namely Nov. 2014, Jun. 2015 and Jan. 2016.
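The snapshot names appear to encode the snapshot index, year, month and number of transactions. As a minimal sketch under that assumption (the pattern is inferred from the three names above, and `parse_snapshot_name` is a hypothetical helper, not part of the dataset), a name can be split into its parts like this:

```python
import re
from typing import NamedTuple


class Snapshot(NamedTuple):
    index: int           # 1, 2 or 3
    year: int            # e.g. 2014
    month: int           # e.g. 11 for November
    n_transactions: int  # e.g. 1500000


# Assumed layout: dataset<index>_<year>_<month>_<count>
_NAME = re.compile(r"dataset(\d+)_(\d{4})_(\d{1,2})_(\d+)")


def parse_snapshot_name(name: str) -> Snapshot:
    """Split a snapshot name into its numeric parts."""
    match = _NAME.fullmatch(name)
    if match is None:
        raise ValueError(f"unexpected snapshot name: {name!r}")
    return Snapshot(*map(int, match.groups()))


for name in ("dataset1_2014_11_1500000",
             "dataset2_2015_6_1500000",
             "dataset3_2016_1_1500000"):
    print(parse_snapshot_name(name))
```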

We also provide a file of labeled addresses belonging to mixing services; these addresses were active during the observation period of our snapshots.

Because Bitcoin is pseudonymous, it is difficult to enforce Know-Your-Customer (KYC) processes, which are standard guidelines in anti-money laundering. Moreover, mixing services in Bitcoin, originally designed to enhance transaction anonymity, have been widely used for money laundering because they complicate the tracing of illicit funds.

In our work, we study mixing service detection with this dataset. As further work, users involved in criminal activities could be tracked down by analyzing the users who take part in Bitcoin mixing.

The details of dataset1_2014_11_1500000 are described below. The file structures of dataset2_2015_6_1500000 and dataset3_2016_1_1500000 are similar to that of dataset1_2014_11_1500000. See the README file for more information.

Data details

About this table
Block information.
Columns (4 columns)
blockID Block ID
bhash Block hash (identifier in the blockchain)
btime Creation time of block
txs Number of transactions
About this table
Transaction ID and hash pairs.
Columns (2 columns)
txID Transaction ID
txhash Transaction hash
About this table
Bitcoin address ID and address pairs.
Columns (2 columns)
addrID Address ID
addr String representation of the address
About this table
Transaction information.
Columns (5 columns)
txID Transaction ID
blockID Block ID
n_inputs Number of inputs
n_outputs Number of outputs
btime Creation time
About this table
List of all transaction inputs.
Columns (3 columns)
txID Transaction ID
addrID Sending address
value Integer sum in Satoshis (1e-8 BTC)
About this table
List of all transaction outputs.
Columns (3 columns)
txID Transaction ID
addrID Receiving address
value Integer sum in Satoshis (1e-8 BTC)
About this table
label.rar contains 38 files; each file has the same structure as BitMixer.io.csv. The files contain addresses of BitMixer.io, BitLaunder.com, BitcoinFog and HelixMixer, crawled from walletexplorer.
Columns (1 column)
Address Address belonging to this service (separated by ‘\n’)
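The input and output tables above share the layout (txID, addrID, value), with values stored as integer Satoshis. The following is a minimal sketch of how they might be combined; the rows are made up for illustration, the helper names are hypothetical, and it assumes the usual Bitcoin convention that a transaction's fee equals its total inputs minus its total outputs (which does not hold for coinbase transactions).

```python
from collections import defaultdict
from decimal import Decimal

SATOSHI_PER_BTC = Decimal(100_000_000)  # 1 Satoshi = 1e-8 BTC


def satoshi_to_btc(value):
    """Convert an integer Satoshi amount to BTC without float rounding."""
    return Decimal(value) / SATOSHI_PER_BTC


def totals_by_tx(rows):
    """Sum the value column per txID for (txID, addrID, value) rows."""
    totals = defaultdict(int)
    for tx_id, _addr_id, value in rows:
        totals[tx_id] += value
    return dict(totals)


def address_edges(inputs, outputs):
    """Link every sending address of a transaction to every receiving one."""
    senders, receivers = defaultdict(set), defaultdict(set)
    for tx_id, addr_id, _value in inputs:
        senders[tx_id].add(addr_id)
    for tx_id, addr_id, _value in outputs:
        receivers[tx_id].add(addr_id)
    edges = set()
    for tx_id, sources in senders.items():
        for src in sources:
            for dst in receivers.get(tx_id, ()):
                edges.add((src, dst, tx_id))
    return sorted(edges)


# Made-up rows in the (txID, addrID, value) layout described above.
tx_inputs = [(1, 10, 150_000_000), (1, 11, 50_000_000), (2, 12, 30_000_000)]
tx_outputs = [(1, 20, 120_000_000), (1, 10, 79_990_000), (2, 21, 29_995_000)]

in_totals = totals_by_tx(tx_inputs)
out_totals = totals_by_tx(tx_outputs)

# Assumed fee convention: inputs minus outputs, in Satoshis.
fees = {tx: in_totals[tx] - out_totals.get(tx, 0) for tx in in_totals}

for tx, fee in fees.items():
    print(tx, fee, satoshi_to_btc(fee))
print(address_edges(tx_inputs, tx_outputs))
```

Using `Decimal` rather than `float` keeps the Satoshi-to-BTC conversion exact. The address-to-address edges are one simple way to turn the two tables into a transaction network; other constructions are equally possible.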

Citation

BibTeX

@misc{wu2020detecting,
    title={Detecting Mixing Services via Mining Bitcoin Transaction Network with Hybrid Motifs},
    author={Jiajing Wu and Jieli Liu and Weili Chen and Huawei Huang and Zibin Zheng and Yan Zhang},
    year={2020},
    eprint={2001.05233},
    archivePrefix={arXiv},
    primaryClass={cs.SI}
}

IEEE

J. Wu, J. Liu, W. Chen, H. Huang, Z. Zheng, and Y. Zhang, “Detecting Mixing Services via Mining Bitcoin Transaction Network with Hybrid Motifs,” arXiv preprint arXiv:2001.05233, 2020.

ACM

Jiajing Wu, Jieli Liu, Weili Chen, Huawei Huang, Zibin Zheng, and Yan Zhang, “Detecting Mixing Services via Mining Bitcoin Transaction Network with Hybrid Motifs,” arXiv preprint arXiv:2001.05233, 2020.

Mt.Gox Leaked Transaction

Introduction

This dataset is the transaction data leaked from the Mt.Gox exchange. First, we merge the buy and sell records of the same transaction, and then deduplicate them by transaction time, transaction account, etc. to ensure that each transaction is unique. This transaction data is very useful for analyzing user behavior in the Bitcoin market.

We have carried out a market manipulation study using this dataset. See the related research for details.

Data details

About this table
Transactions in the Bitcoin market.
Columns (8 columns)
Source The user who sells bitcoins
Target The user who buys bitcoins
Trade_Id The ID of the present transaction
Bitcoins Number of bitcoins involved in the current transaction
Money Dollars spent buying bitcoins
Money_rate Price per bitcoin
Date Date of transaction
label Type of user
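The columns above suggest two simple checks that can be run on the trade table. This is a minimal sketch with made-up rows; it assumes that Money_rate equals Money divided by Bitcoins (an interpretation of "price per bitcoin", not something the dataset documentation states), and the helper names are hypothetical.

```python
import math
from collections import defaultdict

# Made-up trades using the column names described above.
trades = [
    {"Source": "u1", "Target": "u2", "Trade_Id": 1,
     "Bitcoins": 2.0, "Money": 20.0, "Money_rate": 10.0},
    {"Source": "u2", "Target": "u3", "Trade_Id": 2,
     "Bitcoins": 0.5, "Money": 6.0, "Money_rate": 12.0},
    {"Source": "u1", "Target": "u3", "Trade_Id": 3,
     "Bitcoins": 1.0, "Money": 11.0, "Money_rate": 11.0},
]


def rate_is_consistent(trade, rel_tol=1e-6):
    """Check the assumed relation Money_rate == Money / Bitcoins."""
    implied = trade["Money"] / trade["Bitcoins"]
    return math.isclose(implied, trade["Money_rate"], rel_tol=rel_tol)


def net_bitcoin_flow(rows):
    """Net bitcoins per user: bought (as Target) minus sold (as Source)."""
    flow = defaultdict(float)
    for row in rows:
        flow[row["Source"]] -= row["Bitcoins"]
        flow[row["Target"]] += row["Bitcoins"]
    return dict(flow)


inconsistent = [t["Trade_Id"] for t in trades if not rate_is_consistent(t)]
print("inconsistent trades:", inconsistent)
print("net flow:", net_bitcoin_flow(trades))
```

Treating Source and Target as the two ends of a directed edge is also the natural way to view the table as a transaction network between users.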

Citation

BibTeX

@inproceedings{chen2019market, 
title={Market Manipulation of Bitcoin: Evidence from Mining the Mt. Gox Transaction Network},
author={Chen, Weili and Wu, Jun and Zheng, Zibin and Chen, Chuan and Zhou, Yuren},
booktitle={IEEE Conference on Computer Communications},
pages={964--972},
year={2019},
organization={IEEE}
}

IEEE

W. Chen, J. Wu, Z. Zheng, C. Chen, and Y. Zhou, “Market Manipulation of Bitcoin: Evidence from Mining the Mt. Gox Transaction Network,” Proc. - IEEE INFOCOM, vol. 2019-April, no. April 2011, pp. 964–972, 2019, doi: 10.1109/INFOCOM.2019.8737364.

ACM

Weili Chen, Jun Wu, Zibin Zheng, Chuan Chen, and Yuren Zhou. 2019. Market Manipulation of Bitcoin: Evidence from Mining the Mt. Gox Transaction Network. Proceedings - IEEE INFOCOM 2019-April, April 2011: 964–972. https://doi.org/10.1109/INFOCOM.2019.8737364

Bitcoin Price and Volume Dataset

Introduction

This dataset contains Bitcoin market data (price and volume) from August 2015 (when Ether first appeared) to March 2019. The sampling interval is four hours; that is, each record holds the price and volume values for one four-hour period.

The original market data of Bitcoin were obtained from Poloniex, one of the most active crypto asset exchanges.

Data details

About this table
Market data for Bitcoin, quoted as the BTC/USDT exchange rate.
Columns (8 columns)
close The closing price in the period
date The timestamp at the beginning of this period
high The highest price in the period
low The lowest price in the period
open The opening price in the period
quoteVolume The quote volume in the period
volume The base volume in the period
weightedVolume The average price over the period, derived from the base volume and quote volume
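As a minimal sketch of how these columns might be used, the code below checks the four-hour spacing and computes log returns from the closing prices. The rows are made up, and two assumptions are not confirmed by the dataset description: that `date` is a Unix timestamp in seconds, and that the average price can be recovered as base volume divided by quote volume.

```python
import math
from datetime import datetime, timezone

FOUR_HOURS = 4 * 60 * 60  # sampling interval in seconds

# Made-up candles; `date` is assumed to be a Unix timestamp in seconds.
candles = [
    {"date": 1438992000, "close": 100.0, "volume": 1000.0, "quoteVolume": 10.0},
    {"date": 1438992000 + FOUR_HOURS, "close": 110.0,
     "volume": 2100.0, "quoteVolume": 20.0},
    {"date": 1438992000 + 2 * FOUR_HOURS, "close": 99.0,
     "volume": 500.0, "quoteVolume": 5.0},
]


def irregular_steps(rows, step=FOUR_HOURS):
    """Return indices whose gap to the previous row is not exactly `step`."""
    return [i for i in range(1, len(rows))
            if rows[i]["date"] - rows[i - 1]["date"] != step]


def log_returns(rows):
    """Log return between consecutive closing prices."""
    return [math.log(rows[i]["close"] / rows[i - 1]["close"])
            for i in range(1, len(rows))]


def implied_price(row):
    """Assumed relation: average price = base volume / quote volume."""
    return row["volume"] / row["quoteVolume"]


returns = log_returns(candles)
print(datetime.fromtimestamp(candles[0]["date"], tz=timezone.utc))
print("irregular steps:", irregular_steps(candles))
print("log returns:", returns)
print("implied prices:", [implied_price(c) for c in candles])
```

Checking the spacing first is useful because gaps in exchange data would otherwise distort the return series.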

Citation

BibTeX

@article{han2020long,
title={Long-range dependence, multi-fractality and volume-return causality of Ether market},
author={Han, Qing and Wu, Jiajing and Zheng, Zibin},
journal={Chaos: An Interdisciplinary Journal of Nonlinear Science},
volume={30},
number={1},
pages={011101},
year={2020},
publisher={AIP Publishing LLC}
}

IEEE

Q. Han, J. Wu and Z. Zheng, “Long-range dependence, multi-fractality and volume-return causality of Ether market,” Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 30, no. 1, pp. 011101, 2020.

ACM

Qing Han, Jiajing Wu and Zibin Zheng, “Long-range dependence, multi-fractality and volume-return causality of Ether market,” Chaos: An Interdisciplinary Journal of Nonlinear Science, vol. 30, no. 1, pp. 011101, 2020.