
New Dataset Added to the AWS Public Blockchain Data: TON (The Open Network)


Learn how to access and analyze TON blockchain data using AWS analytics services like Amazon Athena, SageMaker, and Bedrock.

Author: Pavel Shuvalov, Analytics Lead at TON Foundation

Today, we bring you the TON (The Open Network) dataset, now available in the AWS Public Blockchain Datasets. This new dataset gives researchers, developers, and analysts free access to comprehensive data from the TON blockchain, enabling advanced analytics and research to drive innovation in the blockchain ecosystem.

About the TON Dataset

The TON dataset is generated by the TON-ETL project, developed by TON Studio, a team dedicated to empowering developers within the TON ecosystem. The TON-ETL project is hosted on AWS and uses services such as Amazon S3, Amazon RDS, Amazon Athena, and Amazon EKS for data storage, processing, and analysis. The data is stored as Parquet files in Amazon S3 and partitioned by date, so queries that filter on date scan less data. This makes it well suited to large-scale analytics within the AWS Analytics ecosystem.

The dataset includes a rich set of tables, such as:

  • Blocks and Transactions
  • Messages: Since TON is asynchronous, every operation consists of several messages with payloads.
  • Jetton Events: Records of jetton (TON's token standard) transfers, burns, and mints.
  • NFT Items and Transfers: Data on non-fungible tokens, including ownership and transfer history.
  • DEX Trades and Pools: Decentralized exchange trade data, including swap events and pool activities.

These tables provide a comprehensive view of the TON blockchain, enabling users to explore transaction volumes, token movements, NFT activities, and more.

Accessing the Dataset

The TON dataset is accessible via Amazon S3 at s3://aws-public-blockchain/v1.1/ton/ and updated daily.
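As a minimal sketch of how you might browse this location, the Python snippet below builds AWS CLI listing commands for the dataset root and for a single table partition. Several details are assumptions and not confirmed by this article: that the bucket can be listed anonymously with the --no-sign-request flag, that each table lives in its own folder (the folder name dex_trades is borrowed from the Athena table name used later), and that date partitions use a Hive-style date=<value>/ folder with a YYYY-MM-DD value. The helper names table_prefix and list_command are hypothetical.

```python
import shlex

# Dataset root as given in the article.
DATASET_ROOT = "s3://aws-public-blockchain/v1.1/ton/"


def table_prefix(table, date=None):
    """Return the S3 prefix for a table, optionally narrowed to one date.

    ASSUMPTION: a `<table>/date=<value>/` (Hive-style) folder layout.
    Check the real layout by listing the bucket before relying on it.
    """
    prefix = f"{DATASET_ROOT}{table}/"
    if date is not None:
        prefix += f"date={date}/"
    return prefix


def list_command(prefix=DATASET_ROOT, anonymous=True):
    """Build an `aws s3 ls` command (as an argument list) for a prefix.

    ASSUMPTION: anonymous access works; pass anonymous=False to use
    your configured AWS credentials instead.
    """
    cmd = ["aws", "s3", "ls", prefix]
    if anonymous:
        cmd.append("--no-sign-request")
    return cmd


if __name__ == "__main__":
    # Print the commands so they can be pasted into a shell.
    print(shlex.join(list_command()))
    print(shlex.join(list_command(table_prefix("dex_trades", "2025-01-01"))))
```

Listing the root first is a quick way to confirm the actual folder names and partition format before creating any tables.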

Analyzing the Data

The TON dataset can be analyzed with AWS services such as Amazon Athena, Amazon SageMaker, Amazon QuickSight, and Amazon EMR. For instance, you can use Amazon Athena to run queries for tasks such as:

  • Identifying the top jetton transfer volumes over a specific period.
  • Analyzing transaction patterns to detect trends or anomalies.
  • Tracking NFT sales and ownership changes across the TON network.

Before consuming the data on AWS, you need to create the corresponding tables in the AWS Glue Data Catalog. You can do this with simple AWS CLI commands (this article provides a comprehensive guide to the commands) or with the chat-with-blockchain-data-with-amazon-bedrock project, which contains CDK-powered instructions to create all required tables. The project also prepares a “text to SQL” Amazon Bedrock Agent so you can analyze the data in an AI-powered way.
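To illustrate what creating such a table involves, here is a hedged sketch that generates an Athena DDL statement for one Parquet table, as an alternative to the CLI and CDK routes mentioned above. It is not the method used by either of those resources. The column list is supplied by the caller: the single volume_usd column and its double type in the usage example are illustrative only, and the real schemas are documented by the TON-ETL project. The string-typed date partition column and the table folder in the LOCATION are likewise assumptions, and create_table_ddl and repair_partitions_statement are hypothetical helper names.

```python
def create_table_ddl(database, table, columns, location,
                     partition_column="date", partition_type="string"):
    """Build a CREATE EXTERNAL TABLE statement for a Parquet dataset.

    `columns` is a list of (name, type) pairs taken from the published
    schema. ASSUMPTION: the data is partitioned by one column, named
    `date` by default. Identifiers are wrapped in backticks because
    `date` is a reserved word in Athena DDL.
    """
    if not columns:
        raise ValueError("at least one column is required")
    cols = ",\n".join(f"  `{name}` {col_type}" for name, col_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        f"{cols}\n"
        ")\n"
        f"PARTITIONED BY (`{partition_column}` {partition_type})\n"
        "STORED AS PARQUET\n"
        f"LOCATION '{location}'"
    )


def repair_partitions_statement(table):
    """Statement that discovers Hive-style partitions under LOCATION.

    ASSUMPTION: partitions are stored as `date=<value>/` folders; run
    this with the target database selected.
    """
    return f"MSCK REPAIR TABLE `{table}`"


if __name__ == "__main__":
    ddl = create_table_ddl(
        "ton",
        "dex_trades",
        [("volume_usd", "double")],  # illustrative column only
        "s3://aws-public-blockchain/v1.1/ton/dex_trades/",
    )
    print(ddl)
    print(repair_partitions_statement("dex_trades"))
```

Running the generated statements in the Athena query editor registers the table in the Glue Data Catalog. Replace the illustrative column list with the full schema before using it.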

Here’s a sample SQL query using Amazon Athena to calculate DEX trading volume for the last 30 days:

-- Total DEX trading volume in USD over the last 30 days
SELECT
  SUM(volume_usd) AS total_volume
FROM ton.dex_trades
WHERE
  CAST(date AS date) >= NOW() - INTERVAL '30' DAY
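If you want to run the query above from code, one possible approach is sketched below: a small Python helper that parameterizes the time window, plus an optional function that submits the SQL through the boto3 Athena client. The helper names dex_volume_query and run_on_athena are hypothetical, the output location is a placeholder you must replace with your own bucket, and the sketch assumes the ton.dex_trades table already exists in your Glue Data Catalog.

```python
def dex_volume_query(days=30, database="ton", table="dex_trades"):
    """Return the DEX volume query for the last `days` days.

    `days` is validated as a positive integer so that it can be safely
    interpolated into the SQL text.
    """
    if isinstance(days, bool) or not isinstance(days, int) or days <= 0:
        raise ValueError("days must be a positive integer")
    return (
        "SELECT SUM(volume_usd) AS total_volume\n"
        f"FROM {database}.{table}\n"
        f"WHERE CAST(date AS date) >= NOW() - INTERVAL '{days}' DAY"
    )


def run_on_athena(sql, output_location, database="ton"):
    """Submit a query to Athena and return its execution ID.

    Requires the third-party boto3 package and AWS credentials.
    `output_location` is an S3 URI in a bucket you own where Athena
    writes results (a placeholder here, not part of the dataset).
    """
    import boto3  # imported lazily so the query builder works without it

    client = boto3.client("athena")
    response = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # Print the SQL for a 7-day window; nothing is sent to AWS here.
    print(dex_volume_query(days=7))
```

Athena runs queries asynchronously, so the returned execution ID still has to be polled (for example with get_query_execution) before results can be read.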

Writing a complex SQL query can be time-consuming, so you can alternatively use a Bedrock Agent to work with the data faster. For example, you can ask an agent for the top NFT collections on TON:

[Image: TON Text 2 SQL]

More examples are available here.

Additional Resources

For more information on the AWS Public Blockchain Datasets, refer to the OpenData Registry or this blog post. To learn more about the TON-ETL project and its data schemas, check out the TON-ETL GitHub Repository and TON data models. For advanced use cases, such as integrating TON data with Amazon Bedrock for natural language querying, refer to the chat-with-blockchain-data-with-amazon-bedrock repository.

Conclusion

The addition of the TON dataset to the AWS Public Blockchain Data program marks a significant step in making TON blockchain data more accessible to a global audience. Whether you’re a researcher studying blockchain trends, a developer building Web3 applications, or an analyst exploring DeFi and NFT markets, this dataset provides a powerful resource for your work. We invite you to explore the TON dataset and use AWS analytics tools to uncover new insights.

Become a Data Provider

We welcome additional blockchain data providers to join this initiative. If you're interested in contributing datasets to the AWS Public Blockchain Data program, please contact our team at aws-public-blockchain@amazon.com.