Overview of the Hive Application Framework (HAF)

The Hive Application Framework was developed to simplify the creation of highly scalable, blockchain-based applications. HAF-based apps are naturally resilient against blockchain forks because HAF contains a mechanism for automatically undoing data generated by forked out blocks.

HAF servers act as an intermediary between the Hive network and Hive applications. HAF-based applications do not normally read data directly from Hive nodes (aka hived processes) via a pull model. Instead, HAF applications receive blockchain data via a push model: a hived node is configured with the sql_serializer plugin, which processes each new block as it arrives at the hived node and writes the associated blockchain data (transactions, operations, virtual operations, etc.) to a Postgres database. The server where this Postgres database runs is referred to as a HAF server.
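
For illustration, the sketch below shows roughly how a hived node is pointed at a HAF database via its config file. The plugin and option names follow the sql_serializer documentation, but the database name, role, and paths are assumptions, and the dockerized quickstart further below generates this configuration for you:

  # Rough sketch only: append sql_serializer settings to hived's config.ini.
  # The dbname, user, and data-dir path are placeholders -- adjust to your setup.
  echo 'plugin = sql_serializer' >> haf-datadir/config.ini
  echo 'psql-url = dbname=haf_block_log user=haf_admin hostaddr=127.0.0.1 port=5432' >> haf-datadir/config.ini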

Multiple HAF-based apps can run on a single HAF server, sharing the same HAF database, with each HAF app creating a separate schema where it stores app-specific data.

Since HAF servers receive their data via a push model, they impose a fixed amount of load on the hived node that supplies blockchain data, regardless of the number of HAF apps running on the server. In other words, while too many apps may load down the Postgres database and affect the performance of other apps, the hived node supplying the data should continue to function without any problems.

HAF-app users publish transactions on the Hive network when they want to send data to a HAF-based app. Typically these transactions contain custom_json operations carrying information customized for one or more HAF apps. These operations get included in the blockchain and are thereafter inserted into the HAF database for further processing by any HAF application that is watching for those particular operations. In other words, user actions aren't sent directly to app servers. Instead, they are published to the hived peer-to-peer network, included in the decentralized storage of the Hive blockchain, and then indirectly processed by HAF servers reading data from the blockchain.

An understanding of Hive's custom_json operations is critical to developing an interactive Hive app. A custom_json operation allows a user to embed one or more pieces of arbitrary JSON data into a Hive transaction. Interactive Hive apps can use this feature to create a set of "commands" that their app recognizes and will process when a user publishes a transaction containing those commands.
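
As a sketch of what "watching for particular operations" can look like on the SQL side, the query below pulls custom_json operations with a given id out of a HAF database (assumed here to be named haf_block_log). The view name, column names, and operation-body layout are assumptions made for illustration; the actual schema is documented by hive_fork_manager, described below. 'my-app' stands for a hypothetical custom_json id.

  # Illustrative only: find custom_json "commands" addressed to the hypothetical
  # app id 'my-app'. Verify view/column names against the hive_fork_manager docs.
  psql -d haf_block_log -c "
    SELECT block_num, body
      FROM hive.operations_view
     WHERE body ->> 'type' = 'custom_json_operation'
       AND body -> 'value' ->> 'id' = 'my-app';"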

[Diagram: the main components of a HAF installation]

The image above shows the main components of a HAF installation:

  • HIVED: HAF requires a hived node, which syncs blocks with other hived nodes in the Hive peer-to-peer network and pushes this data into the HAF database. This hived node doesn't need to be located on the HAF server itself, although in some cases this may allow for faster filling of a HAF database that needs to be massively synced (i.e. when you need to fill a database with a lot of already-produced blockchain blocks).
  • SQL_SERIALIZER: sql_serializer is the hived plugin responsible for pushing the data from blockchain blocks into the HAF database. The plugin also informs the database about the occurrence of micro-forks (in which case HAF has to revert database changes that resulted from the forked out blocks). It also signals the database when a block has become irreversible (no longer revertible via a fork), so that the info from that block can be moved from the "reversible" tables inside the database to the "irreversible" tables. Detailed documentation for the sql_serializer is here: src/sql_serializer/README.md
  • POSTGRESQL DATABASE: A HAF database contains data from blockchain blocks in the form of SQL tables (these tables are stored in the "hive" schema inside the database), and it also contains tables for the data generated by HAF apps running on the HAF server (each app has its own separate schema to encapsulate its data). The system uses Postgres authentication and authorization mechanisms to protect HAF-based apps from interfering with each other.
  • HIVE FORK MANAGER: This PostgreSQL extension implements HAF's API inside the "hive" schema and must be included when creating a new HAF database. The extension defines the format of block data saved in the database, along with a set of SQL stored procedures that HAF apps use to get data about the blocks. The sql_serializer dumps blocks to the tables defined by the hive_fork_manager. The extension also defines the process by which HAF apps consume blocks, and it ensures that apps cannot corrupt each other's data. Finally, hive_fork_manager is responsible for rewinding the state of the tables of all the HAF apps running on the server when a micro-fork occurs (a rough sketch of the app-side flow follows this list). Detailed documentation for hive_fork_manager is here: src/hive_fork_manager/Readme.md
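
Referenced in the last bullet above, the sketch below shows very roughly what the app-side flow looks like: an app gets its own schema plus a hive_fork_manager context, and tables registered with that context are rewound automatically when a micro-fork occurs. The function name and signature shown are assumptions for illustration only; the real API and the recommended main loop are described in src/hive_fork_manager/Readme.md.

  # Rough, illustrative app bootstrap -- not the exact hive_fork_manager API.
  psql -d haf_block_log -c "
    CREATE SCHEMA my_app;                      -- app-specific schema, as described above
    SELECT hive.app_create_context('my_app');  -- register the app with hive_fork_manager (exact name/signature may differ)
  "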

HAF server quickstart

NOTE: The fastest and easiest way to install and maintain a HAF server is to use the docker compose scripts from the haf_api_node repo: https://gitlab.syncad.com/hive/haf_api_node

But if you prefer to build your own HAF docker image, you can follow the steps below:

  git clone --recurse-submodules --branch develop https://gitlab.syncad.com/hive/haf.git
  mkdir -p workdir/haf-datadir/blockchain
  cd workdir
  ../haf/scripts/ci-helpers/build_instance.sh local-develop ../haf/ registry.gitlab.syncad.com/hive/haf/

Now you can either sync your hived node from scratch via the Hive P2P network, or you can download a copy of the blockchain's block_log file from a location you trust (e.g. https://gtg.openhive.network/get/blockchain/compressed/block_log) and replay the block_log. The latter method is typically faster, because a replay doesn't re-validate the blocks, but the first method (syncing from scratch) requires the least trust.
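
If you go the block_log route, fetching the file into the blockchain directory created earlier (run from workdir) might look like the sketch below; the exact block_log layout expected by hived can vary between versions, so treat this as illustrative:

  # Illustrative: download a trusted block_log into the data dir created above.
  wget -O haf-datadir/blockchain/block_log \
    https://gtg.openhive.network/get/blockchain/compressed/block_log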

To start your HAF server, type:

  ../haf/scripts/run_hived_img.sh registry.gitlab.syncad.com/hive/haf/instance:local-develop --name=haf-instance --webserver-http-endpoint=8091 --webserver-ws-endpoint=8090  --data-dir=$(pwd)/haf-datadir --replay

If you don't have a local block_log file, just remove the --replay option from the command line above to get the blockchain blocks using the P2P network via the normal sync procedure.

It is advisable to have your own custom PostgreSQL config files in order to have PostgreSQL logs available locally and to specify custom database access permissions. To do that, before starting your HAF server, copy the doc/haf_postgresql_conf.d directory into your data directory (the directory passed via --data-dir); it contains configuration files where you can override any PostgreSQL setting.
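
A sketch of that step, run from workdir before starting the container; the name of the override file inside the copied directory is an assumption here, so check the actual contents after copying:

  # Illustrative: copy the sample PostgreSQL config overrides into the data dir.
  cp -a ../haf/doc/haf_postgresql_conf.d haf-datadir/
  # Example override (logging_collector is a standard PostgreSQL setting; the
  # override file name below is an assumption -- use whatever file was copied).
  echo 'logging_collector = on' >> haf-datadir/haf_postgresql_conf.d/custom_postgres.conf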

The steps above should create a haf-datadir/haf_db_store subdirectory containing a PostgreSQL database holding the HAF data, and a haf-datadir/hived.log file containing the output of the underlying hived process.
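
To follow sync or replay progress, you can watch either of those outputs, for example (the docker logs variant assumes the container also sends hived output to its stdout):

  tail -f haf-datadir/hived.log    # hived output written to the data dir
  docker logs -f haf-instance      # container logs, using the --name given above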

Use docker container stop haf-instance to safely stop the service.

See dockerized deployment details for more information.

HAF manual build and deployment steps are described here: doc/HAF_Detailed_Deployment.md