Predictoor: AI-Powered Bots Get a UX Upgrade, with CLI & YAML

The pdr-backend v0.2 release adds a command-line interface and a YAML file for setting parameters, making bots easier to run

Published in Ocean Protocol · Jan 17, 2024

Contents
- Intro: Predictoor & Bots
- About v0.1 flows
- Challenges in v0.1 flows
- Introducing v0.2 
- How v0.2 fixes v0.1 challenges
- Conclusion

Summary

With Predictoor, you can run AI-powered prediction bots or trading bots on crypto price feeds to earn $. With the pdr-backend v0.2 release, the interface for running predictoor bots & trader bots just got a lot simpler, via a CLI and a YAML file for parameters. The release also refactors backend code so that we can do more powerful experiments around making $.

Get started at the pdr-backend’s README.

1. Intro: Predictoor & Bots

Ocean Predictoor provides on-chain “prediction feeds” on whether ETH, BTC, etc will rise in the next 5 min or 60 min. “Predictoors” submit predictions and stake on them; predictions are aggregated and sold to traders as alpha. Predictoor runs on Oasis Sapphire, the only confidential EVM chain in production. We launched Predictoor and its Data Farming incentives in September & November 2023, respectively.

The pdr-backend GitHub repo has the Python code for all bots: Predictoor bots, Trader bots, and support bots (submitting true values, buying on behalf of DF, etc).

As a predictoor, you run a predictoor bot with the help of a predictoor bot README in the pdr-backend GitHub repo. It takes 1–2 h to go through, including getting OCEAN & ROSE in Sapphire. The repo provides starting-point predictoor bots, which gather historical CEX price data and build AI/ML models. You can gain your own edge — to earn more $ — by changing the bot as you please: more data, better feature vectors, different modeling approaches, and more.

Similarly, as a trader, you can run a trader bot with the help of a trader bot README. The repo provides starting-point trader bots, which use Predictoor price feeds as alpha for trading. You can gain your own edge — to earn more $ — by changing the bot as you please for more sophisticated trading strategies.

Predictoor has promising traction, with 1.86M transactions and $1.86M in volume in the previous 30d [Ref DappRadar] [1].

Ocean Predictoor stats summary. Ref DappRadar Jan 17, 2024

Ocean Predictoor Volume vs time. Ref DappRadar Jan 17, 2024

Our main internal goal is to make $ trading, and then bring those learnings to the community in the form of product updates and related communications. Towards that, we’ve been eating our own dogfood: running our own predictoor bots & trader bots, and improving things based on our own experience. Most of these improvements land in Predictoor’s backend: the pdr-backend repo.

We’ve evolved it a lot lately! Enough that it warrants the first big release since launch (yet still pre-v1). That’s what this blog post describes.

The rest of this post is organized as follows. Section 2 describes the prior release (pdr-backend v0.1), and section 3 its challenges. Section 4 describes the new release (pdr-backend v0.2), focusing on its key features of CLI and YAML file, which help usability in running bots. Section 5 describes how v0.2 addresses the challenges of v0.1. Section 6 concludes.

2. About pdr-backend v0.1 Flows

We released pdr-backend when we launched Predictoor in September 2023, and have been continually improving it since then: fixing bugs, reducing onboarding friction, and adding more capabilities (eg simulation flow).

The first official release was v0.1.0 on November 20, 2023, followed by subsequent v0.1.x releases. It is licensed under Apache 2.0, a highly permissive open-source license.

In the last v0.1 predictoor bot README, the flow had you do simulation, then run a bot on testnet, then run a bot on mainnet. Let’s elaborate.

Simulation. You’d start simulation with a call like: python pdr_backend/simulation/runtrade.py. It grabs historical data, builds models, predicts, does simulated trades, then repeats, all on historical data. It logs and plots progress in real time. It would run according to default settings: what price feeds to use for AI model inputs, how much training data, etc. Those settings were hardcoded in the runtrade.py script. To change settings, you’d have to change the script itself, or support code.

A snippet of parameters from v0.1 runtrade.py script

Run a bot on testnet. First, you’d specify envvars via the terminal: your private key, envvars for network (e.g. RPC_URL), and envvars to specify feeds (PAIR_FILTER, TIMEFRAME_FILTER, SOURCE_FILTER). Then you’d run the bot with a call like: python pdr_backend/predictoor/main.py 3. It would run according to default settings. The 3 meant predictoor approach #3: dynamic model building. To change predictoor settings, you’d have to change code.

Run a bot on mainnet. This was like testnet, except specifying different envvars for network and perhaps private key.

Any further configurations, such as what CEX data sources to use in modeling, would be hardcoded in the script. To use different values, you’d have to modify those in your local copy of the code.

The last v0.1 trader bot README had a similar flow to the v0.1 predictoor bot README.

3. Challenges in pdr-backend v0.1 Flows

We were, and are, proud of the v0.1 predictoor bot & trader bot flows. We’d streamlined them fairly well: one could get going quickly, and accomplish what they needed to. To go further and modify parameters, one would have to jump into Python code. At first glance one might have thought this a problem; however, target users (and actual users) are data scientists or developers, who have no trouble modifying code.

Yet there were a few issues. First, it was annoying to manually change code to change parameters.

We could have written higher-level code that looped, and modified the parameters code at each loop iteration; however code that changes code is error-prone and can be dangerous.
Trader bots and predictoor bots had the same issue, and worse: the py code for parameter changes was scattered in a few places. Even if the scattering was fixed, the core issue would remain.

Second, envvars didn’t have enough fidelity, and adding more would have led to an unusably high number of envvars.

Recall that we used envvars to specify feeds (PAIR_FILTER, etc). This wasn’t enough detail for all our needs. For example, in running a predictoor bot, one couldn’t use envvars to specify the model output feed (what feed to predict) and model input price feeds, let alone non-price feeds for model inputs.
And, putting it into envvars would be sloppy and error-prone; if we weren’t careful, we’d have a crazy number of envvars.

Third, a messy CLI was organically emerging.

Recall, one would run a predictoor bot with a custom call directly to the script, such as: python pdr_backend/predictoor/main.py 3, where 3 meant approach 3. Similar for simulation or trader flows.
Support for CLI-level parameters was pretty basic, only lightly tested, and was implemented on a per-script basis. Then, from our own runs of predictoor bots we were starting to do basic analytics, and a new ./scripts/ directory emerged, with each script having its own custom CLI call. Things were getting messier yet.

Finally, we wanted to extend pdr-backend functionality, and doing it in v0.1 code would explode complexity.

We have big plans for our “make $” experiments, and for these, we saw the need to extend functionality by a lot.
We wanted to build out a proper analytics app. We wanted to professionalize and operationalize the data pipeline, for use by simulation, the bots, and the analytics app.
We wanted to extend simulation into a flow that supported experiments on realtime data, with the possibility of live trading. Doing this would have meant even more parameters and flows; if we kept the v0.1 parameter-setting and messy CLI, then complexity would become unwieldy. We needed a cleaner base before we could proceed.

4. Introducing pdr-backend v0.2

We’re pleased to announce the release of pdr-backend v0.2. It solves the issues above 🎉 via a good CLI, and a YAML file to set parameters. It’s live now in the pdr-backend repo.

The rest of this section describes the heavily-updated CLI, the YAML file, and changes to the pdr-backend architecture for a good data pipeline and analytics.

4.1 Updated CLI

You get started with Predictoor like before:

git clone https://github.com/oceanprotocol/pdr-backend
cd pdr-backend
python -m venv venv # create & activate virtual env't
source venv/bin/activate
pip install -r requirements.txt # install modules
export PATH=$PATH:. # add pwd to path

Then, you can type pdr to see the interface at the command-line:

$ pdr
Predictoor tool

Usage: pdr sim|predictoor|trader|..

Main tools:
  pdr sim PPSS_FILE
  pdr predictoor APPROACH PPSS_FILE NETWORK
  pdr trader APPROACH PPSS_FILE NETWORK
  pdr lake PPSS_FILE NETWORK
  pdr claim_OCEAN PPSS_FILE
  pdr claim_ROSE PPSS_FILE

Utilities:
  pdr help
  pdr <cmd> -h
  pdr get_predictoors_info ST END PQDIR PPSS_FILE NETWORK --PDRS
  pdr get_predictions_info ST END PQDIR PPSS_FILE NETWORK --FEEDS
  pdr get_traction_info ST END PQDIR PPSS_FILE NETWORK --FEEDS
  pdr check_network PPSS_FILE NETWORK --LOOKBACK_HOURS

Transactions are signed with envvar 'PRIVATE_KEY`.

Tools for core team:
  pdr trueval PPSS_FILE NETWORK
  pdr dfbuyer PPSS_FILE NETWORK
  pdr publisher PPSS_FILE NETWORK
  pdr topup PPSS_FILE NETWORK
  pytest, black, mypy, pylint, ..

There are commands to run experiments / simulation (pdr sim), predictoor bot (pdr predictoor), trader bot (pdr trader), and for people running predictoor bots to claim rewards (pdr claim_OCEAN, pdr claim_ROSE).

There’s a new command to fill the data lake (pdr lake), and several new analytics-related commands (pdr get_predictoors_info, …, pdr check_network). Remaining commands are typically for use by the core team.

To get help for a given command, just type the command without any argument values.
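To make the shape of such an interface concrete, here is a minimal sketch built on Python's standard argparse module. It mirrors only two of the commands listed above (pdr sim and pdr predictoor). The function name build_parser, the use of sub-parsers, and the network name "sapphire-testnet" are assumptions made for illustration; this is not the actual pdr-backend implementation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy 'pdr'-like parser (hypothetical; not the real pdr-backend code)."""
    parser = argparse.ArgumentParser(prog="pdr", description="Predictoor tool (sketch)")
    subparsers = parser.add_subparsers(dest="cmd", required=True)

    # pdr sim PPSS_FILE
    sim = subparsers.add_parser("sim", help="run simulation")
    sim.add_argument("PPSS_FILE", help="path to the YAML parameter file")

    # pdr predictoor APPROACH PPSS_FILE NETWORK
    predictoor = subparsers.add_parser("predictoor", help="run a predictoor bot")
    predictoor.add_argument("APPROACH", type=int, help="predictoor approach number")
    predictoor.add_argument("PPSS_FILE", help="path to the YAML parameter file")
    predictoor.add_argument("NETWORK", help="network name defined in the YAML file")

    return parser


# Parse an explicit argument list instead of sys.argv, so the sketch runs anywhere.
# "sapphire-testnet" is a placeholder network name.
args = build_parser().parse_args(["predictoor", "3", "my_ppss.yaml", "sapphire-testnet"])
print(f"Would run '{args.cmd}' with approach={args.APPROACH} on {args.NETWORK}")
```

Keeping every command behind one parser means argument validation and help text live in a single place, rather than being re-implemented per script.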

4.2 New: YAML file

The default file is ppss.yaml. Most CLI commands take PPSS_FILE (YAML file) as an input argument. Therefore users can make their own copy from the default ppss.yaml, and modify at will.

The YAML file holds most parameters; the CLI specifies which YAML file and network, and sometimes commonly-updated parameters.

To minimize confusion, there are no envvars. All parameters are in the YAML file or the CLI. One exception: PRIVATE_KEY envvar is retained because putting it in a file would have reduced security.

The YAML file has a sub-section for each bot: a predictoor_ss section, a trader_ss section, etc. The web3_pp section holds info for all networks.

Below is an example of the predictoor_ss section in the YAML file. Note how it specifies a feed to predict (predict_feed), as well as input feeds for AI modeling (aimodel_ss.input_feeds).

predictoor_ss:
  predict_feed: binance BTC/USDT c 1h
  bot_only:
    s_until_epoch_end: 60 # in s. Start predicting if there's > this time left
    stake_amount: 1 # stake this amount with each prediction. In OCEAN
  aimodel_ss:
    input_feeds:
      - binance BTC/USDT c 1h
#      - binance BTC/USDT ETH/USDT BNB/USDT XRP/USDT ADA/USDT DOGE/USDT SOL/USDT LTC/USDT TRX/USDT DOT/USDT ohlcv 1h
#      - kraken BTC/USDT 1h
    max_n_train: 5000 # no. epochs to train model on
    autoregressive_n : 10 # no. epochs that model looks back, to predict next
    approach: LIN
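The feed strings in the example above pack several fields into one line. The sketch below shows one way such a string could be expanded into structured records. The field layout (exchange, one or more pairs, an optional signal token, timeframe) is inferred from the three example lines, and reading each signal character as a separate signal (e.g. "ohlcv" as five signals) is an assumption. Feed and parse_feed_string are hypothetical names, not pdr-backend's own parser.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Feed:
    exchange: str
    pair: str
    signal: str  # assumed: a single character such as "c"; "" if unspecified
    timeframe: str


def parse_feed_string(feed_str: str) -> List[Feed]:
    """Expand a string like 'binance BTC/USDT c 1h' into one Feed per pair and signal."""
    tokens = feed_str.split()
    if len(tokens) < 3:
        raise ValueError(f"Expected at least exchange, pair, timeframe: {feed_str!r}")

    exchange, timeframe = tokens[0], tokens[-1]
    middle = tokens[1:-1]
    pairs = [t for t in middle if "/" in t]  # e.g. "BTC/USDT"
    signal_tokens = [t for t in middle if "/" not in t]  # e.g. "c" or "ohlcv"
    if not pairs or len(signal_tokens) > 1:
        raise ValueError(f"Could not interpret feed string: {feed_str!r}")

    # Assumption: each character of the signal token names one signal.
    signals = list(signal_tokens[0]) if signal_tokens else [""]
    return [Feed(exchange, pair, sig, timeframe) for pair in pairs for sig in signals]


feeds = parse_feed_string("binance BTC/USDT ETH/USDT ohlcv 1h")
print(len(feeds), feeds[0])
```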

Most CLI commands take NETWORK as an input argument. The YAML file holds RPC_URL and other network parameters for each network, and the NETWORK CLI argument selects among them. Therefore, to use a different network (e.g. testnet → mainnet), one only needs to change the network name in the CLI. Compare this to v0.1, where several envvars needed changing. A bonus: the new setup allows convenient storage of many different network configurations (in the YAML file).
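The selection step can be pictured as a simple lookup. The sketch below assumes the network section has already been parsed into a dictionary keyed by network name. The network names, the rpc_url key and the URLs are placeholders, and select_network and load_private_key are hypothetical helpers rather than pdr-backend functions. It also shows the one value that still comes from the environment, PRIVATE_KEY.

```python
import os
from typing import Any, Dict, Mapping

# Hypothetical stand-in for the parsed network section of the YAML file.
# Names, keys and URLs are placeholders, not real pdr-backend values.
WEB3_PP: Dict[str, Dict[str, Any]] = {
    "sapphire-testnet": {"rpc_url": "https://testnet.example.invalid"},
    "sapphire-mainnet": {"rpc_url": "https://mainnet.example.invalid"},
}


def select_network(web3_pp: Mapping[str, Dict[str, Any]], network: str) -> Dict[str, Any]:
    """Return the parameter block for the network named on the command line."""
    if network not in web3_pp:
        known = ", ".join(sorted(web3_pp))
        raise ValueError(f"Unknown network {network!r}; known networks: {known}")
    return web3_pp[network]


def load_private_key(environ: Mapping[str, str] = os.environ) -> str:
    """Read the signing key from the PRIVATE_KEY envvar; it is never stored in the file."""
    key = environ.get("PRIVATE_KEY")
    if not key:
        raise RuntimeError("PRIVATE_KEY envvar is not set")
    return key


# Switching testnet -> mainnet is just a different name; nothing else changes.
print(select_network(WEB3_PP, "sapphire-testnet")["rpc_url"])
print(select_network(WEB3_PP, "sapphire-mainnet")["rpc_url"])
```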

When the whole YAML file is read, it creates a PPSS object. That object has attributes corresponding to each bot: a predictoor_ss object (of class PredictoorSS), a trader_ss object (of class TraderSS), etc. It also holds network info in its web3_pp object (of class Web3PP).
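A rough picture of this object structure is sketched below. MiniPPSS and MiniPredictoorSS are deliberately simplified, hypothetical stand-ins for the real PPSS and PredictoorSS classes, whose actual attributes and methods may differ. The dictionary is the parsed form of the predictoor_ss example shown earlier.

```python
from typing import Any, Dict, List


class MiniPredictoorSS:
    """Simplified stand-in for PredictoorSS: wraps the predictoor_ss sub-dict."""

    def __init__(self, d: Dict[str, Any]):
        self.d = d

    @property
    def predict_feed(self) -> str:
        return self.d["predict_feed"]

    @property
    def stake_amount(self) -> float:
        return self.d["bot_only"]["stake_amount"]

    @property
    def input_feeds(self) -> List[str]:
        return self.d["aimodel_ss"]["input_feeds"]


class MiniPPSS:
    """Simplified stand-in for PPSS: one attribute per section of the file."""

    def __init__(self, d: Dict[str, Any]):
        self.predictoor_ss = MiniPredictoorSS(d["predictoor_ss"])
        # trader_ss, web3_pp, etc. would be wrapped the same way.


# What a YAML loader would return for the predictoor_ss example.
config = {
    "predictoor_ss": {
        "predict_feed": "binance BTC/USDT c 1h",
        "bot_only": {"s_until_epoch_end": 60, "stake_amount": 1},
        "aimodel_ss": {
            "input_feeds": ["binance BTC/USDT c 1h"],
            "max_n_train": 5000,
            "autoregressive_n": 10,
            "approach": "LIN",
        },
    }
}

ppss = MiniPPSS(config)
print(ppss.predictoor_ss.predict_feed, ppss.predictoor_ss.stake_amount)
```

In practice the dictionary would come from a YAML loader; PyYAML's yaml.safe_load is one common choice, though this post does not say which loader pdr-backend uses.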

4.3 New: Good data pipeline

We refined pdr-backend architecture to have a proper data pipeline, in new directory /lake/. It’s centered around a data lake with tiers from raw → refined. We’ve moved from storing raw price data as csv files, to parquet files, because parquet supports querying without needing to have a special database layer on top (!), among other benefits.

In conjunction, we’ve moved from Pandas dataframes to Polars dataframes, because Polars scales better and plays well with parquet. (We are already seeing intensive data management and expect our data needs to grow by several orders of magnitude.)

4.4 New: Space to grow analytics

We’ve also updated pdr-backend analytics support, in the new directory /analytics/. First, what used to be ad-hoc scripts for analytics tools now has proper CLI support: pdr get_predictoors_info, …, pdr check_network. These analytics tools now use data from the lake, and continue to be evolved. Furthermore, we are building towards a more powerful analytics app that uses Python-style plotting in the browser, via Streamlit.

5. How pdr-backend v0.2 Fixes v0.1 Issues

Here’s how v0.2 fixes each of the four issues raised above.

Issue: Annoying to manually change code to change parameters
v0.2 fix: use YAML file & CLI for all parameters. The YAML file holds most parameters; the CLI specifies which YAML file and network, and sometimes commonly-updated parameters. The YAML file holds parameters that were previously envvars, or somewhere in code. Here’s the default YAML file.
Issue: envvars didn’t have enough fidelity
v0.2 fix: use YAML file & CLI for all parameters. In the YAML file, each bot gets its own subsection, including which feeds to work with. The YAML has far more fidelity because it also includes variables that were previously in code.
Issue: a messy CLI was organically emerging
v0.2 fix: now we have a clean CLI. Previous calls to scripts for simulation, predictoor bot, trader bot, and various analytics are all now folded into the CLI. The CLI is implemented in new directory /cli/; its core modules cli_arguments.py and cli_module.py use argparse, the standard-library CLI parser for Python. The CLI has unit tests and system tests.
Issue: we wanted to extend pdr-backend functionality, and doing it in v0.1 code would explode complexity.
v0.2 fix: YAML & clean CLI give a less-complex, more flexible foundation to build from. And we’re now nicely along in our progress: as we were building v0.2, we have also refined its architecture to have a proper data pipeline (in /lake/), the beginnings of a more powerful analytics app (in /analytics/), and are about to upgrade the simulation engine for more flexible and powerful experiments.
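One consequence of holding parameters as data rather than code is that experiments can vary them without rewriting any source files. The sketch below illustrates the idea on an already-parsed configuration dictionary. The helpers with_override and sweep, and the dotted-key convention, are hypothetical and not part of pdr-backend.

```python
import copy
from typing import Any, Dict, Iterable, Iterator


def with_override(config: Dict[str, Any], dotted_key: str, value: Any) -> Dict[str, Any]:
    """Return a deep copy of config with one nested parameter replaced."""
    new_config = copy.deepcopy(config)
    *parents, leaf = dotted_key.split(".")
    node = new_config
    for name in parents:
        node = node[name]
    if leaf not in node:
        raise KeyError(f"Unknown parameter: {dotted_key}")
    node[leaf] = value
    return new_config


def sweep(config: Dict[str, Any], dotted_key: str, values: Iterable[Any]) -> Iterator[Dict[str, Any]]:
    """Yield one config variant per candidate value, leaving the base untouched."""
    for value in values:
        yield with_override(config, dotted_key, value)


base = {"predictoor_ss": {"aimodel_ss": {"max_n_train": 5000, "autoregressive_n": 10}}}
variants = list(sweep(base, "predictoor_ss.aimodel_ss.max_n_train", [1000, 5000, 20000]))
for v in variants:
    print(v["predictoor_ss"]["aimodel_ss"]["max_n_train"])
```

Each variant could then be written out as its own YAML file and passed to the CLI as PPSS_FILE.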

6. Conclusion

With Ocean Predictoor, you can run AI-powered prediction bots or trading bots on crypto price feeds to earn $. With pdr-backend v0.2, the interface for running predictoor bots & trader bots just got a lot simpler, via a CLI and a YAML file for parameters. The release also refactors backend code so that we can do more powerful experiments around making $.

Get started at the pdr-backend’s README.

Notes

[1] The two 1.86M values are a coincidence. Usually the values aren’t identical, though they are typically within 0.5x to 2x of each other.

Final Note

None of the content in this post should be taken as financial advice. Everything you do is your responsibility, at your discretion.

About Predictoor & Ocean Protocol

In Ocean Predictoor, people run AI-powered prediction bots or trading bots on crypto price feeds to earn $. Follow Predictoor on Twitter, and get support in Discord. Track progress on GitHub at pdr-backend and more. Predictoor runs on the Oasis Sapphire confidential EVM chain.

Predictoor is powered by Ocean Protocol, which provides tools to privately & securely publish, exchange, and consume data. Follow Ocean on Twitter or Telegram.