
Lux Monitoring

Full Observability Stack

Overview

Lux Monitoring provides a complete observability stack for Lux validators and network services. It ships two Docker Compose configurations -- one for Lux mainnet node monitoring and one for DEX-specific monitoring -- with pre-built Grafana dashboards, Prometheus scrape targets, Loki log aggregation, and Alertmanager routing. There is also a standalone monitoring-installer.sh shell script for bare-metal Ubuntu validators that installs Prometheus, Grafana, and node_exporter via systemd.

Quick Reference

| Item | Value |
|---|---|
| Repo | github.com/luxfi/monitoring |
| Stack | Docker Compose |
| Public URL | monitor.lux.network |
| Grafana default creds | admin / luxnetwork |
| Grafana port | 3100 (mapped from container 3000) |
| Prometheus port | 9090 |
| Loki port | 3101 (mapped from container 3100) |
| Alertmanager port | 9093 |
| Node Exporter port | 9100 |
| Postgres Exporter port | 9187 |
| TSDB retention | 30 days |
| Loki retention | 30 days (720h) |

Compose Files

There is no root compose.yml. Two compose files exist:

compose.luxnet.yml -- Lux Network Monitoring

The primary stack. Starts 8 services:

| Service | Image | Container Name | Purpose |
|---|---|---|---|
| luxd | ubuntu:22.04 | lux-node | Lux node (network-id 96369, ports 9630/9631/8546) |
| prometheus | prom/prometheus:latest | lux-prometheus | Metrics TSDB, 30d retention |
| grafana | grafana/grafana:latest | lux-grafana | Dashboard UI |
| loki | grafana/loki:latest | lux-loki | Log aggregation (TSDB schema v13) |
| promtail | grafana/promtail:latest | lux-promtail | Log shipping (Docker, luxd, syslog, Blockscout) |
| node-exporter | prom/node-exporter:latest | lux-node-exporter | System metrics |
| postgres-exporter | prometheuscommunity/postgres-exporter:latest | lux-postgres-exporter | PostgreSQL metrics |
| alertmanager | prom/alertmanager:latest | lux-alertmanager | Alert routing |

Networks: lux-monitoring (internal bridge), hanzo-network (external, shared with other Hanzo services).

Volumes: prometheus_data, grafana_data, loki_data, alertmanager_data, luxd_data.
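Putting the tables above together, the shape of compose.luxnet.yml is roughly as follows. This is an abridged sketch, not the verbatim file: the service names, images, networks, and volumes come from the tables, while the command flags and mount paths shown are plausible assumptions.

```yaml
# Abridged sketch of compose.luxnet.yml (two of eight services shown)
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: lux-prometheus
    command:
      - --config.file=/etc/prometheus/prometheus.yml
      - --storage.tsdb.retention.time=30d   # 30-day TSDB retention
    volumes:
      - prometheus_data:/prometheus
    networks: [lux-monitoring]

  grafana:
    image: grafana/grafana:latest
    container_name: lux-grafana
    ports:
      - "3100:3000"   # host 3100 -> container 3000
    volumes:
      - grafana_data:/var/lib/grafana
    networks: [lux-monitoring, hanzo-network]

networks:
  lux-monitoring:
    driver: bridge
  hanzo-network:
    external: true    # shared with other Hanzo services

volumes:
  prometheus_data:
  grafana_data:
```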

compose.dex.yml -- DEX Extension

Extends the main stack with DEX-specific services:

| Service | Container Name | Purpose |
|---|---|---|
| dex-exporter | lux-dex-exporter | Node exporter for DEX host (port 9101) |
| dex-metrics-aggregator | lux-dex-metrics | Collects DEX engine/consensus/HFT metrics |
| vector | lux-dex-vector | High-perf log aggregation (Vector, port 8686) |
| jaeger | lux-dex-jaeger | Distributed tracing (Jaeger UI port 16686) |
| cadvisor | lux-dex-cadvisor | Container metrics (port 8088) |

Additional Grafana plugins installed: grafana-piechart-panel, grafana-worldmap-panel.
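Grafana plugins are commonly installed via the GF_INSTALL_PLUGINS environment variable; a plausible sketch of how compose.dex.yml could override the grafana service (the exact mechanism used in the repo is an assumption):

```yaml
# Hypothetical sketch: compose.dex.yml override adding the extra panels
services:
  grafana:
    environment:
      - GF_INSTALL_PLUGINS=grafana-piechart-panel,grafana-worldmap-panel
```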

Pre-configured Dashboards

13 dashboard JSON files across three directories:

Core Dashboards (grafana/dashboards/)

| File | Dashboard | Key Panels |
|---|---|---|
| c_chain.json | C-Chain | EVM block production, gas usage, tx throughput |
| c_chain_load.json | C-Chain Load | Block gas utilization, pending tx pool depth |
| p_chain.json | P-Chain | Validator set size, staking metrics, subnet creation |
| x_chain.json | X-Chain | UTXO transactions, asset creation rates |
| machine.json | Machine Metrics | CPU, memory, disk I/O, network bandwidth |
| database.json | Database | PostgreSQL connections, query perf, replication |
| logs.json | Logs | Structured log search via Loki datasource |
| main.json | Main Overview | Network-wide health summary, peer count |
| network.json | Network | P2P message latency, bandwidth in/out, warp delivery |
| subnets.json | Subnets | Subnet health, validator participation |
| lux-comprehensive.json | Comprehensive | All-in-one overview dashboard |

DEX Dashboard (grafana/dashboards/dex/)

| File | Dashboard |
|---|---|
| dex-performance.json | Order matching latency (p50/p95/p99/p999), throughput, DPDK/RDMA metrics |

MorpheusVM Dashboard (grafana/dashboards/morpheusvm/)

| File | Dashboard |
|---|---|
| performance.json | MorpheusVM execution performance |

Grafana Provisioning

Datasources (grafana/provisioning/datasources/datasources.yml)

| Name | Type | URL |
|---|---|---|
| Prometheus | prometheus | http://prometheus:9090 (default, POST method) |
| Loki | loki | http://loki:3100 (max 5000 lines) |
| Lux Node | prometheus | http://host.docker.internal:9650/ext/metrics |
| PostgreSQL | postgres | lux-postgres:5432, database explorer_luxnet |
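The table above corresponds to a Grafana datasource provisioning file roughly like the sketch below. Values are taken from the table; the remaining fields (apiVersion, jsonData keys) follow Grafana's standard provisioning format and are assumptions about this repo's file.

```yaml
# Sketch of grafana/provisioning/datasources/datasources.yml
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    url: http://prometheus:9090
    isDefault: true
    jsonData:
      httpMethod: POST
  - name: Loki
    type: loki
    url: http://loki:3100
    jsonData:
      maxLines: 5000
  - name: Lux Node
    type: prometheus
    url: http://host.docker.internal:9650/ext/metrics
  - name: PostgreSQL
    type: postgres
    url: lux-postgres:5432
    database: explorer_luxnet
```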

Dashboard Provider (grafana/provisioning/dashboards/dashboards.yml)

Provider name: Lux Network Dashboards, folder: Lux Network, uid: lux-network. Auto-updates every 10s and allows UI edits.
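In Grafana's standard provider format, those settings map onto a file roughly like this sketch (the dashboards path is an assumption, and the stated uid is rendered here as folderUid):

```yaml
# Sketch of grafana/provisioning/dashboards/dashboards.yml
apiVersion: 1
providers:
  - name: Lux Network Dashboards
    folder: Lux Network
    folderUid: lux-network
    type: file
    updateIntervalSeconds: 10   # auto-update every 10s
    allowUiUpdates: true        # UI edits allowed
    options:
      path: /var/lib/grafana/dashboards   # assumed mount point
```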

Prometheus Scrape Configuration

Main Config (prometheus/prometheus.yml)

Global scrape interval: 15s. Evaluation interval: 15s.

| Job Name | Metrics Path | Target | Instance Label |
|---|---|---|---|
| prometheus | /metrics | localhost:9090 | -- |
| node | /metrics | node-exporter:9100 | -- |
| luxd | /ext/metrics | luxd:9630, host.docker.internal:9630 | lux-mainnet |
| c-chain | /ext/bc/C/metrics | luxd:9630 | c-chain-mainnet |
| c-chain-rpc | /ext/bc/C/rpc | luxd:9630 | c-chain-rpc |
| x-chain | /ext/bc/X/metrics | luxd:9630 | x-chain-mainnet |
| p-chain | /ext/bc/P/metrics | luxd:9630 | p-chain-mainnet |
| lux-health | /ext/health | luxd:9630 | lux-health |
| blockscout-lux | /metrics | host.docker.internal:4000 | blockscout-lux-mainnet |
| postgres | /metrics | postgres-exporter:9187 | -- |
| grafana | /metrics | grafana:3000 | -- |
| loki | /metrics | loki:3100 | -- |
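In Prometheus configuration terms, two representative rows of the table look roughly like this (the instance label placement via static_configs labels is an assumption about how the file is written):

```yaml
# Abridged sketch of prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: luxd
    metrics_path: /ext/metrics
    static_configs:
      - targets: ['luxd:9630', 'host.docker.internal:9630']
        labels:
          instance: lux-mainnet

  - job_name: c-chain
    metrics_path: /ext/bc/C/metrics
    static_configs:
      - targets: ['luxd:9630']
        labels:
          instance: c-chain-mainnet
```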

DEX Config (prometheus/dex-prometheus.yml)

16 additional scrape jobs for DEX monitoring, with intervals as short as 100ms for HFT metrics. Key jobs:

| Job | Scrape Interval | Target | Purpose |
|---|---|---|---|
| dex-engine | 15s | lux-dex-node:9090 | Order book engine |
| dex-matching | 1s | lux-dex-node:8080 | Match latency (HFT) |
| dex-websocket | 15s | lux-dex-node:8081 | WebSocket feed |
| dex-consensus | 15s | lux-dex-node:5000 | FPC consensus (K=1) |
| dex-quantum | 15s | lux-dex-node:8080 | Ringtail-BLS signatures |
| dex-hft | 100ms | lux-dex-node:8080 | Ultra-HF trading |
| dex-dpdk | 15s | lux-dex-node:8080 | Kernel bypass (DPDK) |
| dex-rdma | 15s | lux-dex-node:8080 | Zero-copy replication |
| dex-gpu | 15s | lux-dex-node:8080 | MLX/CUDA acceleration |
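Sub-second intervals are set per job: a job-level scrape_interval overrides the 15s global default. A sketch of the 100ms HFT job:

```yaml
# Sketch: per-job interval override for the 100ms HFT job
scrape_configs:
  - job_name: dex-hft
    scrape_interval: 100ms   # overrides the 15s global interval
    static_configs:
      - targets: ['lux-dex-node:8080']
```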

DEX Recording Rules (prometheus/dex-rules.yml)

Evaluated every 5s:

| Rule | Expression |
|---|---|
| dex:matching_latency:p50 | histogram_quantile(0.50, rate(dex_matching_latency_bucket[1m])) |
| dex:matching_latency:p95 | histogram_quantile(0.95, ...) |
| dex:matching_latency:p99 | histogram_quantile(0.99, ...) |
| dex:matching_latency:p999 | histogram_quantile(0.999, ...) |
| dex:orders_per_second | rate(dex_orders_processed_total[1m]) |
| dex:trades_per_second | rate(dex_trades_executed_total[1m]) |
| dex:consensus_round_time | avg rate of dex_consensus_round_duration_seconds |
| dex:consensus_finality_time | histogram_quantile(0.95, rate(dex_consensus_finality_bucket[1m])) |
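For intuition about what these rules compute: histogram_quantile estimates a quantile from Prometheus's cumulative histogram buckets by finding the bucket containing the target rank and linearly interpolating inside it. A minimal Python model of that estimate (not Prometheus source; bucket values below are hypothetical matching latencies):

```python
def histogram_quantile(q, buckets):
    """Model of PromQL histogram_quantile(): estimate the q-quantile from
    cumulative buckets [(upper_bound, cumulative_count), ...], sorted by
    bound, with the last bound equal to float('inf')."""
    total = buckets[-1][1]
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if bound == float('inf'):
                # target falls in the +Inf bucket: clamp to highest finite bound
                return buckets[-2][0]
            # linear interpolation within the bucket, as Prometheus does
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / (count - prev_count)
        prev_bound, prev_count = bound, count
    return buckets[-1][0]

# Hypothetical: 100 matching-latency observations, bounds in seconds
buckets = [(0.001, 50), (0.010, 90), (0.100, 100), (float('inf'), 100)]
p50 = histogram_quantile(0.50, buckets)  # exactly at the 1ms bound
p95 = histogram_quantile(0.95, buckets)  # interpolated in the 1ms-10ms bucket
```

This is also why p999 figures are only meaningful if the bucket boundaries are fine-grained near the tail: the estimate is linear within a bucket.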

Alert Rules

Chain Alerts (prometheus/alerts/chain-alerts.yml)

Evaluated every 30s:

| Alert | Expression | For | Severity |
|---|---|---|---|
| LuxNodeDown | up{job="luxd"} == 0 | 5m | critical |
| CChainNotSyncing | rate(lux_C_last_accepted_height[5m]) == 0 | 10m | warning |
| CChainHighRejectionRate | rejection > 10% of accepted | 5m | warning |
| PChainNotSyncing | rate(lux_P_last_accepted_height[5m]) == 0 | 10m | warning |
| XChainNotSyncing | rate(lux_X_last_accepted_height[5m]) == 0 | 10m | warning |
| LowPeerCount | lux_network_peers < 5 | 10m | warning |
| HighCPUUsage | process_cpu_seconds_total > 80% | 10m | warning |
| HighMemoryUsage | MemAvailable < 10% | 10m | critical |
| PostgresDown | up{job="postgres"} == 0 | 5m | critical |
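In Prometheus rule-file syntax, one row of this table maps to a rule roughly like the sketch below (the group name and annotation text are assumptions, not copied from the repo):

```yaml
# Sketch of one rule from prometheus/alerts/chain-alerts.yml
groups:
  - name: chain-alerts
    interval: 30s              # evaluated every 30s
    rules:
      - alert: LuxNodeDown
        expr: up{job="luxd"} == 0
        for: 5m                # must hold 5m before firing
        labels:
          severity: critical
        annotations:
          summary: "Lux node {{ $labels.instance }} is down"
```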

DEX Alerts (prometheus/dex-rules.yml)

| Alert | Threshold | Severity |
|---|---|---|
| DEXHighLatency | p99 > 1ms | critical |
| DEXElevatedLatency | p95 > 500us | warning |
| DEXLowThroughput | < 1M trades/sec | warning |
| DEXConsensusFailure | consensus node down | critical |
| DEXHighMemoryUsage | > 90% memory | warning |
| DEXDatabaseDown | PostgreSQL down | critical |
| DEXWebSocketDrops | > 10 disconnects/sec | warning |
| DEXOrderBookDesync | sync errors > 0 | critical |
| DEXPacketDrops | > 100 DPDK drops/sec | warning |
| DEXGPUFailure | GPU unavailable | warning |

Alertmanager Configuration (alertmanager/alertmanager.yml)

  • Route grouping: by alertname, cluster, service
  • Group wait: 10s, repeat interval: 12h
  • Critical alerts route to critical receiver (configure webhook/email)
  • Inhibition: critical suppresses warning for same alertname/instance
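Those bullets map onto an Alertmanager file roughly like this sketch (receiver names are assumptions; webhook/email settings are left to be filled in, as the bullets note):

```yaml
# Sketch of alertmanager/alertmanager.yml
route:
  group_by: [alertname, cluster, service]
  group_wait: 10s
  repeat_interval: 12h
  receiver: default
  routes:
    - match:
        severity: critical
      receiver: critical

receivers:
  - name: default
  - name: critical          # attach webhook_configs / email_configs here

inhibit_rules:
  - source_match:
      severity: critical    # a firing critical alert...
    target_match:
      severity: warning     # ...suppresses the matching warning
    equal: [alertname, instance]
```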

Loki Configuration

  • Schema: TSDB v13 with filesystem storage
  • Ingestion rate: 16 MB/s (burst 32 MB/s)
  • Per-stream rate limit: 2 MB/s (burst 4 MB/s)
  • Max entries per query: 5000
  • Reject samples older than 168h (7 days)
  • Retention: 720h (30 days)
  • Query cache: 100 MB embedded cache
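The limits above correspond to Loki's limits_config block; a sketch of just that block (other sections of the Loki config are omitted):

```yaml
# Sketch of the limits/retention portion of the Loki config
limits_config:
  ingestion_rate_mb: 16             # 16 MB/s
  ingestion_burst_size_mb: 32       # burst 32 MB/s
  per_stream_rate_limit: 2MB        # per-stream 2 MB/s
  per_stream_rate_limit_burst: 4MB
  max_entries_limit_per_query: 5000
  reject_old_samples: true
  reject_old_samples_max_age: 168h  # 7 days
  retention_period: 720h            # 30 days
```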

Promtail Log Sources

| Job | Path Pattern | Pipeline |
|---|---|---|
| docker | /var/lib/docker/containers/*/*log | JSON parse, container name extraction |
| luxd | /lux-logs/network_current/node*/logs/*.log | Multiline, regex for timestamp/level/logger |
| syslog | /var/log/syslog | Regex for hostname/program/pid |
| blockscout | Docker container logs | JSON parse, stream filtering |
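As an example of the pipeline column, the luxd job could look roughly like this in Promtail's scrape_configs (the multiline firstline pattern and regex capture groups are assumptions about the luxd log format):

```yaml
# Sketch of the luxd job in promtail's scrape_configs
scrape_configs:
  - job_name: luxd
    static_configs:
      - targets: [localhost]
        labels:
          job: luxd
          __path__: /lux-logs/network_current/node*/logs/*.log
    pipeline_stages:
      - multiline:
          firstline: '^\[\d{4}-\d{2}-\d{2}'   # assumed timestamp prefix
      - regex:
          expression: '^\[(?P<timestamp>[^\]]+)\]\s+(?P<level>\w+)\s+(?P<logger>\S+)'
```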

Bare-metal Installer

grafana/monitoring-installer.sh -- 5-step installer for Ubuntu validators; each step is selected with a numbered flag:

  • --1 Install Prometheus (systemd service)
  • --2 Install Grafana (systemd service)
  • --3 Install node_exporter (systemd service)
  • --4 Install Lux Grafana dashboards
  • --5 Install additional dashboards (optional)

Supports both amd64 and arm64 architectures. Run without arguments to download latest dashboards only.

Quickstart

git clone https://github.com/luxfi/monitoring.git
cd monitoring

# Create external network if needed
docker network create hanzo-network

# Start the full Lux monitoring stack
./start-monitoring.sh
# Or directly:
docker compose -f compose.luxnet.yml up -d

# Start with DEX monitoring extension
docker compose -f compose.luxnet.yml -f compose.dex.yml up -d

# Access points:
#   Grafana:       http://localhost:3100 (admin/luxnetwork)
#   Prometheus:    http://localhost:9090
#   Loki:          http://localhost:3101
#   Alertmanager:  http://localhost:9093
#   Jaeger (DEX):  http://localhost:16686

Production Deployment

deploy-monitor-nginx.sh configures nginx reverse proxy for monitor.lux.network:

  1. Symlinks nginx config to /etc/nginx/sites-enabled/
  2. Tests and reloads nginx
  3. Requires Cloudflare DNS A record pointing to the server
  4. Public URL: https://monitor.lux.network
See also

  • lux/lux-node.md -- Node that exports metrics at /ext/metrics
  • lux/lux-universe.md -- Production K8s infrastructure
