Streaming JSONL data

In this example, we demonstrate how to stream JSONL data from a URL, parse it into a dataframe, and visualise the data. We use the XSGD-USDC pair on Uniswap V3 as an example.

JSON Lines

JSON Lines is a convenient format for storing structured data that may be processed one record at a time. It is the preferred method for getting large amounts of OHLCV data, because records can be processed incrementally and with a lower memory footprint.

JSON Lines is like JSON, but each new line (record) is a new JSON object. The basic structure of JSON Lines is very simple:

{"name": "John", "age": 30, "city": "New York"}
{"name": "Jane", "age": 25, "city": "Chicago"}
{"name": "Jim", "age": 35, "city": "Los Angeles"}

In the above example, each line is a valid JSON object, and each line can be parsed independently of the other lines. This property makes JSON Lines ideal for data storage where new records are appended over time, such as log files. It’s also well suited to processing that can be carried out line by line, for example with Unix tools such as ‘grep’, ‘sed’, or ‘awk’.
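As a minimal sketch of this line-by-line property, the snippet below parses the three example records one at a time and then appends a fourth. The `io.StringIO` object only stands in for an open file, and the appended record is made-up illustration data.

```python
import io
import json

# The three example records, as they would appear in a .jsonl file
jsonl_text = (
    '{"name": "John", "age": 30, "city": "New York"}\n'
    '{"name": "Jane", "age": 25, "city": "Chicago"}\n'
    '{"name": "Jim", "age": 35, "city": "Los Angeles"}\n'
)

# io.StringIO stands in for an open file; iterating it yields one line at a time
total_age = 0
cities = []
for line in io.StringIO(jsonl_text):
    record = json.loads(line)  # each line parses on its own
    total_age += record["age"]
    cities.append(record["city"])

# Appending a record only requires writing one more line
# (the record itself is made up for illustration)
jsonl_text += json.dumps({"name": "Joan", "age": 41, "city": "Boston"}) + "\n"
```

Only one record is held in memory at a time inside the loop, which is what keeps the memory footprint low for large files.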

In comparison, regular JSON data would look like:

[
  {"name": "John", "age": 30, "city": "New York"},
  {"name": "Jane", "age": 25, "city": "Chicago"},
  {"name": "Jim", "age": 35, "city": "Los Angeles"}
]

In the above, you can see that the entire string has to be parsed as a whole to get the data, unlike JSON Lines where each line is independently a JSON object.
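One practical consequence can be sketched with a download that is cut off partway through. The example below truncates both formats at the same point; the shortened records and the variable names are illustrative assumptions, not part of any API.

```python
import json

json_array = '[{"name": "John"}, {"name": "Jane"}, {"name": "Jim"}]'
json_lines = '{"name": "John"}\n{"name": "Jane"}\n{"name": "Jim"}\n'

# Simulate a transfer that stops in the middle of the last record
cut_array = json_array[:-8]
cut_lines = json_lines[:-8]

# The truncated array is not valid JSON, so no records can be recovered
try:
    array_records = json.loads(cut_array)
except json.JSONDecodeError:
    array_records = []

# With JSON Lines, every complete line before the cut is still usable
line_records = []
for line in cut_lines.split("\n"):
    try:
        line_records.append(json.loads(line))
    except json.JSONDecodeError:
        break  # incomplete trailing line, stop here
```

Here the array version yields nothing, while the JSON Lines version still yields the first two records.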

Data Streaming

[1]:
import requests
import pandas as pd
import json

pair_url = "https://tradingstrategy.ai/api/candles-jsonl?pair_ids=2699634&time_bucket=1d"

def get_candles(url: str) -> pd.DataFrame:
    response = requests.get(url)
    response.raise_for_status()  # fail early on HTTP errors

    # Each line of the response body is one JSON object (one candle)
    json_objects = []
    for line in response.text.split('\n'):
        try:
            json_objects.append(json.loads(line))
        except json.JSONDecodeError:
            # Skip blank or malformed lines, such as the empty string after the final newline
            continue

    candles = pd.DataFrame(json_objects)
    candles = candles.rename(columns={'ts': 'date', 'o': 'open', 'h': 'high', 'l': 'low', 'c': 'close', 'v': 'volume'})
    # 'date' holds UNIX timestamps in seconds, so pass unit='s' to get calendar dates
    candles['timestamp'] = pd.to_datetime(candles['date'], unit='s')
    candles = candles.set_index('date')
    return candles

pair_data = get_candles(pair_url)

display(pair_data.head())
                open      high       low     close         volume   xr     b     s    tc    bv    sv        p        sb        eb   timestamp
date
1623196800  0.751273  0.761408  0.751273  0.753983   46404.942246  1.0  None  None  None  None  None  2699634  12599483  12602480  2021-06-09
1623283200  0.753983  0.754360  0.749023  0.754360   27170.926637  1.0  None  None  None  None  None  2699634  12607138  12608081  2021-06-10
1623369600  0.754511  0.754511  0.754511  0.754511     556.739452  1.0  None  None  None  None  None  2699634  12611519  12611519  2021-06-11
1623456000  0.749398  0.753380  0.733605  0.753380  126186.194150  1.0  None  None  None  None  None  2699634  12618259  12619487  2021-06-12
1623542400  0.744022  0.751574  0.744022  0.751574   44663.119939  1.0  None  None  None  None  None  2699634  12628856  12628860  2021-06-13
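The function above downloads the whole response body before parsing it. For larger datasets, the records could instead be consumed while the download is still in progress. The following is a rough sketch of that approach, using the `stream=True` option and `Response.iter_lines()` from `requests`; the helper names `iter_jsonl` and `stream_candles` are hypothetical and not part of the `tradingstrategy` package.

```python
import json
from typing import Iterable, Iterator, Union


def iter_jsonl(lines: Iterable[Union[bytes, str]]) -> Iterator[dict]:
    """Yield one parsed record per non-empty line."""
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        line = line.strip()
        if line:
            yield json.loads(line)


def stream_candles(url: str) -> Iterator[dict]:
    """Yield candle records while the HTTP response is still downloading."""
    # Third-party dependency, imported here so iter_jsonl() is usable without it
    import requests

    # stream=True defers downloading the body; iter_lines() then reads it
    # in chunks and hands back one line at a time
    with requests.get(url, stream=True) as response:
        response.raise_for_status()
        yield from iter_jsonl(response.iter_lines())
```

With this in place, an expression such as `max(candle["h"] for candle in stream_candles(pair_url))` would find the highest price without holding every candle in memory, assuming the `h` field used in the column mapping above.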

Data Visualisation

[2]:
from tradingstrategy.charting.candle_chart import visualise_ohlcv

def get_figure(candles: pd.DataFrame, chart_name: str):
    return visualise_ohlcv(
        candles,
        height=600,
        theme="plotly_white",
        chart_name=chart_name,
        y_axis_name="Price",
        volume_axis_name="volume",
    )

fig = get_figure(pair_data, "XSGD/USDC")

fig.show()