Architecture Overview
This document provides a high-level overview of how apphash.io works and how the different components interact.
System Architecture
┌─────────────────────────────────────────────────────────────┐
│                      Cosmos SDK Chain                       │
│  ┌─────────────────────────────────────────────────────┐    │
│  │                       App.go                        │    │
│  │  ┌─────────────────────────────────────────────┐    │    │
│  │  │            Streaming Listeners              │    │    │
│  │  │  • Captures KV store changes                │    │    │
│  │  │  • Tracks state diffs per block             │    │    │
│  │  │  • Monitors consensus events                │    │    │
│  │  └──────────────────────┬──────────────────────┘    │    │
│  └─────────────────────────┼─────────────────────────┘      │
│                            │                                │
│  ┌─────────────────────────▼─────────────────────────┐      │
│  │                     Memlogger                     │      │
│  │  • In-memory buffering                            │      │
│  │  • Asynchronous compression (gzip)                │      │
│  │  • Message filtering                              │      │
│  │  • Object pooling                                 │      │
│  └─────────────────────────┬─────────────────────────┘      │
│                            │                                │
│  ┌─────────────────────────▼─────────────────────────┐      │
│  │               Write-Ahead Log (WAL)               │      │
│  │  • Compressed segments (.wal.gz)                  │      │
│  │  • Index files (.wal.idx)                         │      │
│  │  • Daily rotation                                 │      │
│  │  Location: $CHAIN_DIR/data/log.wal/               │      │
│  └─────────────────────────┬─────────────────────────┘      │
└────────────────────────────┼────────────────────────────────┘
                             │
                ┌────────────▼────────────┐
                │    Analyzer Shipper     │
                │  • Watches WAL dir      │
                │  • Ships to platform    │
                └────────────┬────────────┘
                             │
                ┌────────────▼────────────┐
                │  apphash.io Platform    │
                │  • Analysis             │
                │  • Visualization        │
                │  • Alerting             │
                └─────────────────────────┘

Component Overview
1. Streaming Listeners (App.go)
Purpose: Capture state changes at the application layer
Functionality:
- Hooks into CommitMultiStore
- Listens to all KV store operations
- Captures state diffs during block commits
- Sends data to DebugChangeLogger
Location: Your chain’s app.go
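The listener hook described above can be sketched in miniature. This is a self-contained illustration, not the Cosmos SDK's actual state-listening API: the names KVWriteListener and DiffRecorder are hypothetical stand-ins for the real interfaces wired up in app.go.

```go
package main

import "fmt"

// KVWriteListener receives every write/delete applied to a KV store.
// Illustrative stand-in for the SDK's state-listening interface.
type KVWriteListener interface {
	OnWrite(storeKey string, key, value []byte, delete bool)
}

// DiffRecorder collects per-block state diffs, mirroring what the
// streaming listeners forward to DebugChangeLogger.
type DiffRecorder struct {
	Diffs []string
}

func (d *DiffRecorder) OnWrite(storeKey string, key, value []byte, delete bool) {
	op := "set"
	if delete {
		op = "del"
	}
	d.Diffs = append(d.Diffs, fmt.Sprintf("%s %s/%x", op, storeKey, key))
}

func main() {
	var l KVWriteListener = &DiffRecorder{}
	l.OnWrite("bank", []byte{0x01}, []byte("100"), false) // a store write
	l.OnWrite("bank", []byte{0x02}, nil, true)            // a store delete
	fmt.Println(l.(*DiffRecorder).Diffs) // [set bank/01 del bank/02]
}
```

In the real integration the multistore fans writes out to registered listeners during block commit; the recorder above simply shows the shape of the data that flows downstream.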
2. Memlogger
Purpose: Efficient log buffering and compression
Key Features:
- In-memory buffering: Collects logs before writing to disk
- Asynchronous compression: Compresses logs in background goroutines
- Message filtering: Allows filtering to consensus-critical events only
- Object pooling: Zero-allocation design for high performance
- Time/size-based flushing: Configurable triggers for persistence
Location: Integrated into Cosmos SDK
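The buffering-plus-async-compression design can be sketched with the standard library. This is a minimal illustration of the pattern (in-memory buffer, sync.Pool for compression buffers, size-triggered flush, background gzip goroutine), not the actual memlogger API; all type and field names here are assumptions.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"sync"
)

// bufPool recycles compression buffers, avoiding per-flush allocations.
var bufPool = sync.Pool{New: func() any { return new(bytes.Buffer) }}

type Memlogger struct {
	mu       sync.Mutex
	buf      bytes.Buffer
	flushLen int         // size threshold that triggers a flush
	out      chan []byte // compressed segments ready for the WAL
	wg       sync.WaitGroup
}

func NewMemlogger(flushLen int) *Memlogger {
	return &Memlogger{flushLen: flushLen, out: make(chan []byte, 8)}
}

// Log buffers a message and, past the threshold, hands the batch to a
// background goroutine. The caller never waits on compression or disk.
func (m *Memlogger) Log(msg string) {
	m.mu.Lock()
	m.buf.WriteString(msg + "\n")
	if m.buf.Len() >= m.flushLen {
		raw := append([]byte(nil), m.buf.Bytes()...)
		m.buf.Reset()
		m.wg.Add(1)
		go m.compress(raw) // asynchronous: compression off the hot path
	}
	m.mu.Unlock()
}

func (m *Memlogger) compress(raw []byte) {
	defer m.wg.Done()
	b := bufPool.Get().(*bytes.Buffer)
	b.Reset()
	zw := gzip.NewWriter(b)
	zw.Write(raw)
	zw.Close()
	m.out <- append([]byte(nil), b.Bytes()...)
	bufPool.Put(b)
}

func main() {
	ml := NewMemlogger(16)
	ml.Log("block=1 store=bank set key=01")
	ml.wg.Wait()
	seg := <-ml.out
	fmt.Println("compressed segment:", len(seg), "bytes")
}
```

The production design adds a time-based flush trigger and filtering on top of the same skeleton.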
3. Write-Ahead Log (WAL)
Purpose: Persistent storage of compressed logs
Structure:
$CHAIN_DIR/data/log.wal/
└── <node-id>/
    └── <yyyy-mm-dd>/
        ├── seg-000001.wal.gz    # Compressed log data
        ├── seg-000001.wal.idx   # Index for efficient replay
        ├── seg-000002.wal.gz
        ├── seg-000002.wal.idx
        └── ...

Features:
- Automatic segment rotation
- Platform-optimized fsync
- Chronological ordering preserved
- Efficient compression (90%+ reduction)
4. Analyzer Shipper
Purpose: Ship logs from node to apphash.io platform
Functionality:
- Monitors WAL directory for new segments
- Uploads compressed logs to apphash.io
- Handles retries and error recovery
- Maintains checkpointing for reliability
Repository: walship
5. apphash.io Platform
Purpose: Analysis, visualization, and alerting
Capabilities:
- Real-time monitoring of state changes
- Detection of non-determinism
- Root cause analysis
- Historical data exploration
- Alert configuration
Data Flow
1. Block Processing
Transaction → State Machine → KV Store Changes → Streaming Listener

When a block is processed:
- Transactions modify application state
- Changes are written to KV stores
- Streaming listeners capture the changes
- DebugChangeLogger formats the data
2. Log Capture
DebugChangeLogger → Memlogger Buffer → Compression → WAL

The memlogger process:
- Receives formatted log messages
- Buffers them in memory
- Compresses when threshold reached (time or size)
- Writes to WAL segments with index
3. Log Shipping
WAL Files → Analyzer Shipper → apphash.io Platform

Continuous shipping:
- Shipper watches for new WAL segments
- Uploads completed segments
- Platform processes and stores data
- Dashboard updates in real-time
Performance Design
Zero-Copy Architecture
The memlogger uses efficient memory management:
- Object pooling for compression buffers
- Minimal allocations during hot path
- Batch operations to reduce syscalls
Asynchronous Processing
Non-blocking design ensures:
- Consensus never waits for logging
- Compression happens in background
- Disk writes don’t block state machine
Filtering
Configurable filtering reduces overhead:
- Allow-list for important message types
- Drops non-critical debug messages
- Maintains full detail for consensus events
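An allow-list filter of this kind is a one-map lookup on the hot path. A sketch; the message-type names on the list are illustrative examples, not the actual filter configuration:

```go
package main

import "fmt"

// allowed is the allow-list: only these message types are buffered.
// The entries here are hypothetical examples of consensus-critical types.
var allowed = map[string]bool{
	"commit":     true,
	"state_diff": true,
	"consensus":  true,
}

// keep reports whether a message type passes the filter; anything not on
// the allow-list is dropped before it ever reaches the buffer.
func keep(msgType string) bool { return allowed[msgType] }

func main() {
	fmt.Println(keep("state_diff")) // true
	fmt.Println(keep("debug"))      // false: non-critical, dropped
}
```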
Production Readiness
Proven Performance
Based on production benchmarks:
| Metric | Impact |
|---|---|
| Memory | +10-50MB (configurable) |
| CPU | Negligible (async) |
| Disk I/O | Minimal (compressed, batched) |
| Network | None (local only) |
| Latency | No impact on block time |
Reliability Features
- Crash recovery: data persisted to the WAL survives restarts
- Ordering guarantee: Chronological order preserved
- Graceful degradation: Drops messages on failure rather than blocking
- Bounded memory: Configurable limits prevent unbounded growth
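The drop-rather-than-block behavior and the bounded-memory guarantee come from the same mechanism: a bounded channel written with a non-blocking send. A sketch under illustrative names:

```go
package main

import "fmt"

// tryLog attempts a non-blocking send. When the buffer is full the message
// is dropped instead of stalling the caller, so logging can never block
// consensus or deadlock the state machine.
func tryLog(ch chan string, msg string) bool {
	select {
	case ch <- msg:
		return true
	default:
		return false // buffer full: graceful degradation
	}
}

func main() {
	ch := make(chan string, 1) // bounded buffer: memory cannot grow unbounded
	fmt.Println(tryLog(ch, "first"))  // true
	fmt.Println(tryLog(ch, "second")) // false (buffer full, dropped)
}
```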
Security Considerations
Local Processing Only
- All compression and logging happens locally
- No network calls from memlogger
- Shipper is separate process with controlled access
Data Isolation
- Logs are node-specific (by node-id)
- No cross-contamination between nodes
- Clear directory structure for access control
No Consensus Impact
- Logging failures don’t affect consensus
- Non-blocking design prevents deadlocks
- State machine remains authoritative
Scalability
Horizontal Scaling
- Each node operates independently
- No coordination required between nodes
- Platform scales with number of monitored chains
Storage Management
- Compressed logs reduce disk usage by 90%+
- Daily rotation enables easy archival
- Old logs can be safely deleted
- Shipper can be configured for retention policies
Next Steps
Dive deeper into specific components:
- Memlogger Architecture - Detailed memlogger implementation
- Log Format & Storage - WAL structure and format