Agent-Based Single Cryptocurrency Trading Challenge
FinNLP-FNP-LLMFinLegal @ COLING-2025
January 19-20 2025, Abu Dhabi, UAE
Introduction
Background: Cryptocurrency trading markets present a dynamic and challenging environment, characterized by high volatility and extreme sensitivity to time-sensitive information. These markets are influenced by a wide array of data streams, ranging from real-time price data to long-term economic reports, each with varying degrees of immediacy and impact. Recent advancements in Large Language Models (LLMs) have led to the development of autonomous agents capable of processing this complex, multi-timeliness information landscape, addressing the growing demand in finance for automated systems that transform vast amounts of real-time data into decisions while integrating information across multiple time horizons.
Goal: The CryptoTrading competition provides a platform to evaluate the proficiency of participants' specialized LLMs in daily cryptocurrency trading. Through pre-training or domain-specific fine-tuning, these models are expected to demonstrate enhanced comprehension and reasoning abilities, enabling them to respond effectively to volatile and complex cryptocurrency market conditions.
Here, we use FinMem [1], an integrated LLM-based agent framework designed for trading different types of financial assets, as the evaluation framework for the submitted LLMs on automated cryptocurrency trading. Under the FinMem framework, the submitted LLMs serve as backbone models that support FinMem's essential functions in making sequential cryptocurrency trading decisions. The evaluation examines, from the following perspectives, how far pre-training or fine-tuning enhances an LLM's ability to tackle challenging questions in financial asset trading:
Domain-specific knowledge comprehension: The submitted LLMs should demonstrate a solid understanding of cryptocurrency fundamentals, including basic characteristics, trading principles, market dynamics, and current trends.
Awareness of varying timeliness in multi-source financial data: The fine-tuning process should enable LLMs to recognize and appropriately weigh the different time sensitivities of various financial data sources. For example, daily cryptocurrency news may have a more immediate but shorter-lived impact compared to longer-term market indicators or regulatory changes.
Robust reasoning for sustained multi-turn financial decision-making: The fine-tuned LLMs will be back-tested over an approximately 6-month period on a daily basis. However, their practical applications may extend to lower-frequency, longer-horizon asset-trading tasks. Therefore, it is crucial that these models consistently deliver high-quality sequences of trading decisions by integrating multi-source information. This sustained performance is essential for LLMs to serve effectively as the backbone of financial language agents.
We sincerely welcome students, researchers, and engineers passionate about financial technology, LLMs, and language agent design to participate in this challenge!
Task
Overview
Task Description
In this task, participants are asked to provide a pre-trained or fine-tuned LLM that performs daily single-cryptocurrency trading (Bitcoin and Ethereum) and will be evaluated under the FinMem agent framework. Participants may use, but are not limited to, the data provided below, and are encouraged to deploy FinMem from the GitHub repo to evaluate and select training checkpoints. After pre-training or fine-tuning your model, you can evaluate its performance with FinMem on the training data. If you are satisfied with the results, you may upload your model to Hugging Face for testing. We will use the submitted models for the final evaluation on the test set. The evaluation pipeline is detailed in the section below.
Step 1: Pre-training/fine-tuning your specific model (Fine-tuning Example).
Step 2: Upload your model to Hugging Face (Document); see the upload sketch after Step 3.
Step 3: Validate your model under the FinMem framework (Start kit).
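For Step 2, a minimal sketch of pushing a fine-tuned checkpoint to the Hugging Face Hub with the transformers push_to_hub API is shown below; the checkpoint path and repository name are placeholders for illustration, not values prescribed by the task:

# Placeholder names for illustration only; substitute your own fine-tuned
# checkpoint directory and Hugging Face repository id.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint_dir = "./my-crypto-llm"
repo_id = "your-username/crypto-trading-llm"

model = AutoModelForCausalLM.from_pretrained(checkpoint_dir)
tokenizer = AutoTokenizer.from_pretrained(checkpoint_dir)

# Requires prior authentication, e.g. `huggingface-cli login` or an
# HF_TOKEN environment variable.
model.push_to_hub(repo_id)
tokenizer.push_to_hub(repo_id)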
Evaluation Pipeline
We aim to evaluate the LLM's cryptocurrency trading capability, measuring its performance by profitability. At each time step, the LLM is provided with a feed of memories retrieved from the memory module. Based on this information, the LLM must make an investment decision, choosing from buy, sell, or hold, while providing the corresponding reasoning and specifying the indices of the information supporting the decision. The importance of information leading to profitable trading decisions is increased to raise its likelihood of being sampled in the future; conversely, the importance of information resulting in investment losses is decreased to reduce its probability of being sampled.
To evaluate the fine-tuned LLMs, participants can use FinMem to assess their models' performance on the validation dataset. The final competition rankings will be determined by FinMem's trading performance with the fine-tuned models on a separate blind test set. To better understand FinMem's working mechanism, a brief overview of its daily workflow is sketched below.
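The following is a minimal, illustrative sketch of the daily loop described above, with simple stand-ins for FinMem's components; the helper names, the sampling scheme, and the importance-update factors (1.1 and 0.9) are assumptions for illustration and do not correspond to FinMem's actual API:

import random
from dataclasses import dataclass

@dataclass
class Memory:
    text: str
    importance: float = 1.0   # sampling weight for future retrieval

def retrieve(memories, k=5):
    # Sample a feed of memories with probability proportional to importance.
    weights = [m.importance for m in memories]
    picks = random.choices(range(len(memories)), weights=weights,
                           k=min(k, len(memories)))
    return sorted(set(picks))

def run_day(memories, llm_decide, realized_pnl):
    # llm_decide stands in for the submitted backbone LLM: given the feed
    # texts, it returns an action in {"buy", "sell", "hold"}, a reasoning
    # string, and the feed positions of the supporting memories.
    feed_idx = retrieve(memories)
    action, reasoning, supporting = llm_decide([memories[i].text for i in feed_idx])

    # Observe the profit or loss realized by the chosen action.
    pnl = realized_pnl(action)

    # Reinforce memories behind profitable decisions; down-weight the rest.
    for pos in supporting:
        memories[feed_idx[pos]].importance *= 1.1 if pnl > 0 else 0.9
    return action, reasoning, pnl

The actual framework organizes memories in layered modules with richer retrieval scoring (see the FinMem paper [1]); the sketch above only mirrors the importance-reinforcement behavior described in the evaluation pipeline.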
Dataset
Ticker Introduction
BTC: The ticker symbol "BTC" stands for Bitcoin, which is a decentralized digital currency. Bitcoin operates on a technology called blockchain, which is a public ledger containing all transaction data from anyone who uses Bitcoin.
ETH: The ticker symbol "ETH" represents Ethereum, which is both a cryptocurrency and a blockchain platform. Ethereum is notable for its function as a platform for decentralized applications (DApps) and smart contracts.
Cryptocurrency Market News Data
The dataset offers two elements for this challenge:
Cryptocurrency to USD exchange rates (floating-point values)
News articles (textual data)
We provide the data in JSON format; an illustrative preview of the dataset is shown below.
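As an illustration only, records in the two JSON components might look roughly like the mock-up below; the field names and values are guesses based on the descriptions in this section, not the exact schema shipped with the challenge data:

import json

# Mock records for illustration; the real files may use different keys.
price_record = {
    "date": "2023-02-13",
    "ticker": "BTC",
    "close": 21800.00,          # crypto-to-USD exchange rate (float)
}

news_record = {
    "date": "2023-02-13",
    "news_url": "https://example.com/bitcoin-article",
    "title": "Bitcoin steadies as traders await inflation data",
    "text": "Full article body ...",
    "source_name": "Example Crypto News",
}

print(json.dumps({"price": price_record, "news": news_record}, indent=2))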
Data Source:
Crypto News Recent: This dataset consists of the fields news_url, title, text, source name, and date.
Practice Set and Validation Set Release:
Practice Set link: https://drive.google.com/drive/folders/1Hg_Ee-5NwSy8vdA5eMsTqEAE02w92-zs?usp=sharing
Practice Data Period: from 2022-11-29 to 2023-01-02
Training set link: https://drive.google.com/drive/folders/1fr0nBUhpJ0BIo_rukGPWa9skX4Fj_FeY?usp=sharing
Training Data Period: from 2023-02-13 to 2023-04-02
The practice portion will be made available to participants, while the test segment will be reserved for internal evaluation.
Evaluation Metrics
We offer a comprehensive assessment of profitability, risk management, and decision-making ability through the Sharpe Ratio (SR). The SR will be calculated by averaging across the two tickers.
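For reference, below is a minimal sketch of this metric under one reasonable reading (a per-ticker Sharpe Ratio on daily portfolio returns, averaged over BTC and ETH); the zero risk-free rate and the 365-day annualization are assumptions, since the organizers' exact conventions are not specified here:

import numpy as np

def sharpe_ratio(daily_returns, risk_free_rate=0.0, periods_per_year=365):
    # Annualized Sharpe Ratio of a series of daily portfolio returns.
    excess = np.asarray(daily_returns) - risk_free_rate / periods_per_year
    return np.sqrt(periods_per_year) * excess.mean() / excess.std(ddof=1)

def challenge_score(btc_returns, eth_returns):
    # Final score: average of the per-ticker Sharpe Ratios.
    return 0.5 * (sharpe_ratio(btc_returns) + sharpe_ratio(eth_returns))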
[1] FinMem: A Performance-Enhanced LLM Trading Agent with Layered Memory and Character Design (https://arxiv.org/pdf/2311.13743)
Important Dates
(Released!) Practice set release: 2 September 2024
(Released!) Training set release: 16 September 2024
(Pending) Systems submission due: 11 November 2024 (extended from 7 November 2024) (Submit Your Model Here!)
(Pending) Release of results: 12 November 2024
(Pending) Paper Submission Deadline: 25 November 2024
(Pending) Notifications of Acceptance: 5 December 2024
(Pending) Camera-ready Paper Deadline: 13 December 2024
(Pending) Workshop Date: 19-20 January 2025
Registration
We invite academic institutions, industry professionals, and individuals to participate in this exciting competition. Please register your team using the following link:
Registration link for Agent-Based Single Cryptocurrency Trading Challenge
Please choose a unique team name and ensure that all team members provide their full names, emails, institutions, and the team name. Every team member should register using the same team name. We encourage you to use your institutional email to register.
Submission
Model Submission:
https://docs.google.com/forms/d/14rEJWa8oSbB_w-pFX875d8OTz1t7YtmcCgw9Sp3R9Tw/edit
System Paper Submission:
The ACL template MUST be used for your submission(s). Accepted papers will be published in the ACL Anthology proceedings.
The paper title format is fixed: "[Model or Team Name] at the Crypto Trading Challenge Task: [Title]".
The reviewing process will be single-blind.
Shared task participants will be asked to review other teams' papers during the review period.
Submissions must be in electronic form using the paper submission software linked above.
At least one author of each accepted paper must register for and present their work in person at FinNLP-FNP-LLMFinLegal. Papers with a "No Show" may be redacted. Authors will be required to agree to this requirement at the time of submission. This is a rule for all COLING-2025 workshops.
Leaderboard
Task Organizers
Prof. Jordan W. Suchow
Stevens Institute of Technology
Yangyang Yu
Stevens Institute of Technology
Haohang Li
Stevens Institute of Technology
Yupeng Cao
Stevens Institute of Technology
Keyi Wang
Columbia University, Northwestern University
Zhiyang Deng
Stevens Institute of Technology
Zhiyuan Yao
Stevens Institute of Technology
Yuechen Jiang
Stevens Institute of Technology
Dong Li
FinAI
Ruey-Ling Weng
Yale University
Contact
Contestants can ask any questions on Discord in the #coling2025-finllm-workshop channel.
Contact email: finllmagent@gmail.com