HTTP Fetching and JSON Parsing in Off-chain Workers

pallets/ocw-demo Try on playground View on GitHub

HTTP Fetching

In traditional web apps, we use HTTP requests to communicate with and fetch data from third-party APIs. But this is tricky when we want to perform this in Substrate runtime on chain. First, HTTP requests are indeterministic. There are uncertainty in terms of how long the request will take, and the result may not be the same all the time. This causes problem for the network reaching consensus.

So in Substrate runtime, we use off-chain workers to issue HTTP requests and fetching the results back.

In this chapter, we will dive into fetching information of the GitHub organization substrate-developer-hub that hosts this recipe, specifically, the login, blog, and public_repos values out using GitHub public API.

We issue an http request and return the JSON string in byte vector inside the fetch_from_remote() function.

src: pallets/ocw-demo/src/lib.rs


#![allow(unused)]
fn main() {
// Initiate an external HTTP GET request. This is using high-level wrappers from `sp_runtime`.
let request = rt_offchain::http::Request::get(HTTP_REMOTE_REQUEST);

// Keeping the offchain worker execution time reasonable, so limiting the call to be within 3s.
let timeout = sp_io::offchain::timestamp()
	.add(rt_offchain::Duration::from_millis(FETCH_TIMEOUT_PERIOD));

// For github API request, we also need to specify `user-agent` in http request header.
//   See: https://developer.github.com/v3/#user-agent-required
let pending = request
	.add_header("User-Agent", HTTP_HEADER_USER_AGENT)
	.deadline(timeout) // Setting the timeout time
	.send() // Sending the request out by the host
	.map_err(|_| <Error<T>>::HttpFetchingError)?;

// By default, the http request is async from the runtime perspective. So we are asking the
//   runtime to wait here.
// The returning value here is a `Result` of `Result`, so we are unwrapping it twice by two `?`
//   ref: https://substrate.dev/rustdocs/v2.0.0-rc6/sp_runtime/offchain/http/struct.PendingRequest.html#method.try_wait
let response = pending
	.try_wait(timeout)
	.map_err(|_| <Error<T>>::HttpFetchingError)?
	.map_err(|_| <Error<T>>::HttpFetchingError)?;

if response.code != 200 {
	debug::error!("Unexpected http request status code: {}", response.code);
	return Err(<Error<T>>::HttpFetchingError);
}

// Next we fully read the response body and collect it to a vector of bytes.
Ok(response.body().collect::<Vec<u8>>())
}

We first create a request object. We also set a timeout period so the http request does not hold indefinitely with .deadline(timeout). For querying github APIs, we also need to add an extra HTTP header with add_header(...). HTTP requests from off-chain workers are fetched asynchronously. Here we use try_wait() to wait for the result to come back, and terminate and return if any errors occured.

We then check the response status code to ensure it is okay (equals to 200). Any status code that is non-200 is regarded as error and return. Finally, we get the response back from response.body() iterator. Since we are in a no_std environment, we collect them back as a byte vector instead of a string and return.

JSON Parsing

We frequently get JSON back when requesting from HTTP APIs. The next task is to parse the JSON and fetch the required (key, value) pair out. This is demonstrated in the fetch_n_parse function.

Setup

In Rust, serde and serde-json are the popular combo-package used for JSON parsing. Due to the project setup of compiling Substrate with serde feature std on and cargo feature unification limitation, we cannot simultaneously have serde feature std off (no_std on) when compiling the runtime (this is a bit of a mouthful, the details can be seen in this issue). So we are going to use a renamed serde crate, alt_serde, in our pallet to remedy this situation.

src: pallets/ocw-demo/Cargo.toml

[package]
# ...

[dependencies]
# external dependencies
# ...

alt_serde = { version = "1", default-features = false, features = ["derive"] }
serde_json = { version = "1", default-features = false, git = "https://github.com/Xanewok/json", branch = "no-std", features = ["alloc"] }

# ...

We also use a modified version of serde_json with the latest alloc feature that depends on alt_serde.

Deserializing JSON string to struct

Then we use the usual serde-derive approach on deserializing. First we define the struct with fields that we are interested to extract out.

src: pallets/ocw-demo/src/lib.rs


#![allow(unused)]
fn main() {
// We use `alt_serde`, and Xanewok-modified `serde_json` so that we can compile the program
//   with serde(features `std`) and alt_serde(features `no_std`).
use alt_serde::{Deserialize, Deserializer};

// Specifying serde path as `alt_serde`
// ref: https://serde.rs/container-attrs.html#crate
#[serde(crate = "alt_serde")]
#[derive(Deserialize, Encode, Decode, Default)]
struct GithubInfo {
	// Specify our own deserializing function to convert JSON string to vector of bytes
	#[serde(deserialize_with = "de_string_to_bytes")]
	login: Vec<u8>,
	#[serde(deserialize_with = "de_string_to_bytes")]
	blog: Vec<u8>,
	public_repos: u32,
}
}

By default, serde deserialize JSON string to the datatype String. We want to write our own deserializer to convert it to vector of bytes.


#![allow(unused)]
fn main() {
pub fn de_string_to_bytes<'de, D>(de: D) -> Result<Vec<u8>, D::Error>
where D: Deserializer<'de> {
	let s: &str = Deserialize::deserialize(de)?;
	Ok(s.as_bytes().to_vec())
}
}

Now the actual deserialization takes place in the fetch_n_parse function.


#![allow(unused)]
fn main() {
/// Fetch from remote and deserialize the JSON to a struct
fn fetch_n_parse() -> Result<GithubInfo, Error<T>> {
	let resp_bytes = Self::fetch_from_remote()
		.map_err(|e| {
			debug::error!("fetch_from_remote error: {:?}", e);
			<Error<T>>::HttpFetchingError
		})?;

	let resp_str = str::from_utf8(&resp_bytes)
		.map_err(|_| <Error<T>>::HttpFetchingError)?;

	// Deserializing JSON to struct, thanks to `serde` and `serde_derive`
	let gh_info: GithubInfo = serde_json::from_str(&resp_str).unwrap();
	Ok(gh_info)
}
}