Quick Diagnosis
Node Health Check Script
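#!/bin/bash
# quick-diagnosis.sh
# Set the service name (default: stable)
export SERVICE_NAME=stable
echo "=== Stable Node Diagnosis ==="
echo "Timestamp: $(date)"
echo ""
# 1. Service status
echo "1. Service status:"
systemctl status "${SERVICE_NAME}" --no-pager | head -10
# 2. Sync status
# curl is checked on its own: jq exits 0 on empty input, so piping curl
# straight into jq would never reach the fallback message.
echo -e "\n2. Sync status:"
if status_json=$(curl -sf localhost:26657/status); then
  echo "$status_json" | jq '.result.sync_info'
else
  echo "RPC not responding"
fi
# 3. Peer connections
echo -e "\n3. Peer count:"
if net_json=$(curl -sf localhost:26657/net_info); then
  echo "$net_json" | jq '.result.n_peers'
else
  echo "Unable to fetch peer info"
fi
# 4. Recent errors
echo -e "\n4. Recent errors (last 20):"
sudo journalctl -u "${SERVICE_NAME}" --since "1 hour ago" --no-pager | grep -i error | tail -20
# 5. System resources
echo -e "\n5. System resources:"
df -h / | grep -v Filesystem
free -h | grep Mem
top -bn1 | grep "load average"
# 6. Port status (sudo lets ss show process names owned by other users)
echo -e "\n6. Port status:"
sudo ss -tulpn | grep "${SERVICE_NAME}" || echo "No ports found for ${SERVICE_NAME}"
echo -e "\n=== Diagnosis complete ==="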
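The sync check above prints the raw `sync_info` object and leaves interpretation to the reader. As a minimal sketch, the helper below turns that JSON into a one-word verdict. The function name `sync_state` is hypothetical, and it assumes the `/status` response carries a `catching_up` boolean, as CometBFT-based nodes normally do.

```bash
#!/bin/bash
# sync_state (hypothetical helper): read a /status JSON document on stdin and
# print "synced", "catching up", or "unknown". Uses grep only, so it still
# works on hosts where jq is not installed.
sync_state() {
  local json
  json=$(cat)
  if echo "$json" | grep -Eq '"catching_up"[[:space:]]*:[[:space:]]*false'; then
    echo "synced"
  elif echo "$json" | grep -Eq '"catching_up"[[:space:]]*:[[:space:]]*true'; then
    echo "catching up"
  else
    # Empty or unexpected input, e.g. the RPC endpoint did not answer
    echo "unknown"
  fi
}
```

A typical call would be `curl -s localhost:26657/status | sync_state`.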
Common Issues and Solutions
Node Won't Start
Issue: Binary not found
Error message:
stabled: command not found
# Check whether the binary exists
ls -la /usr/bin/stabled
# If it is missing, reinstall (use the arm64 build instead if your host needs it)
wget https://stable-testnet-data.s3.us-east-1.amazonaws.com/v7/stabled-0.7.2-testnet-linux-amd64.tar.gz
tar -xvzf stabled-0.7.2-testnet-linux-amd64.tar.gz
sudo mv stabled /usr/bin/
sudo chmod +x /usr/bin/stabled
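The reinstall step mentions an arm64 build without saying how to tell which one a host needs. The sketch below maps the output of `uname -m` to the `amd64` or `arm64` suffix. The function name `release_arch` is hypothetical, and this page does not confirm the arm64 archive name, so the sketch only reports the architecture and does not build a download URL.

```bash
#!/bin/bash
# release_arch (hypothetical helper): map a machine name, as printed by
# `uname -m`, to the architecture suffix used in release file names.
# With no argument it inspects the current host. Returns 1 for machines
# this sketch does not recognise.
release_arch() {
  case "${1:-$(uname -m)}" in
    x86_64|amd64)  echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    *)             return 1 ;;
  esac
}
```

Running `release_arch` with no argument on the node prints the suffix to look for on the release page.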
Issue: Permission denied
Error message:
Error: open /home/user/.stabled/config/config.toml: permission denied
# Fix ownership
sudo chown -R "$USER":"$USER" ~/.stabled/
# Fix permissions
chmod 700 ~/.stabled/
chmod 600 ~/.stabled/config/*.json
chmod 644 ~/.stabled/config/*.toml
Issue: Address already in use
Error message:
Error: listen tcp 0.0.0.0:26657: bind: address already in use
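# Find the process using the port
sudo lsof -i :26657
# Stop the process (try a plain kill first; fall back to kill -9 only if it does not exit)
sudo kill <PID>
# Or change the port in the config
sed -i 's/laddr = "tcp:\/\/0.0.0.0:26657"/laddr = "tcp:\/\/0.0.0.0:26658"/' ~/.stabled/config/config.toml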
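Before moving the RPC listener to another port, it helps to confirm that the new port is free. The sketch below reads the output of `ss -ltn` on stdin and reports whether a given port already has a listener. The function name `port_is_listening` is hypothetical, and the parser assumes the default `ss` layout with a header line and the local address in the fourth column.

```bash
#!/bin/bash
# port_is_listening PORT (hypothetical helper): read `ss -ltn` output on stdin
# and exit 0 if some socket is listening on PORT, 1 otherwise.
port_is_listening() {
  local port="$1"
  awk -v p="$port" '
    # Skip the header; the port is the last colon-separated field of the
    # local address column (handles both 0.0.0.0:26657 and [::]:26657).
    NR > 1 { n = split($4, parts, ":"); if (parts[n] == p) found = 1 }
    END    { exit(found ? 0 : 1) }
  '
}
```

For example, `ss -ltn | port_is_listening 26658 && echo "26658 is taken"`.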
Sync Issues
Issue: Node stuck at a specific height
Symptoms:
- Block height is not increasing
- No new blocks for more than 1 minute
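# 1. Check peers
curl -s localhost:26657/net_info | jq '.result.n_peers'
# If there are no peers, set persistent peers.
# Edit the existing key in place: a line appended to the end of config.toml
# lands in the file's last section instead of [p2p].
sed -i 's/^persistent_peers = .*/persistent_peers = "5ed0f977a26ccf290e184e364fb04e268ef16430@37.187.147.27:26656,128accd3e8ee379bfdf54560c21345451c7048c7@37.187.147.22:26656"/' ~/.stabled/config/config.toml
# 2. Reset and resync
sudo systemctl stop ${SERVICE_NAME}
stabled comet unsafe-reset-all --keep-addr-book
sudo systemctl start ${SERVICE_NAME}
# 3. Use a snapshot (see the snapshot guide)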
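Settings such as `persistent_peers` only take effect when they sit inside the right section of `config.toml`. As a sketch, the helper below rewrites one key inside one named section and leaves same-named keys in other sections alone. The function name `set_toml_key` is hypothetical; it assumes GNU `sed`, a key that already exists in the file, and a value that contains no `|` or `&` characters.

```bash
#!/bin/bash
# set_toml_key FILE SECTION KEY VALUE (hypothetical helper)
# Replace `KEY = ...` inside [SECTION] of FILE with `KEY = "VALUE"`.
# The sed address range starts at the [SECTION] header and stops at the next
# line that opens a section, so other sections are not modified.
set_toml_key() {
  local file="$1" section="$2" key="$3" value="$4"
  sed -i "/^\[${section}\]/,/^\[/ s|^${key} = .*|${key} = \"${value}\"|" "$file"
}
```

For example, `set_toml_key ~/.stabled/config/config.toml p2p persistent_peers "<id>@<host>:26656"`, followed by a service restart.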
Issue: "wrong Block.Header.AppHash" error
Error message:
panic: Wrong Block.Header.AppHash. Expected XXXX, got YYYY
# This indicates state corruption; roll back to the previous block
sudo systemctl stop ${SERVICE_NAME}
# Roll back one block
stabled rollback
# Restart the node
sudo systemctl start ${SERVICE_NAME}
# If the rollback does not help, stop the node again and restore from a snapshot
# Back up important files
mkdir -p ~/backup
cp ~/.stabled/config/priv_validator_key.json ~/backup/
cp ~/.stabled/config/node_key.json ~/backup/
# Reset state
stabled comet unsafe-reset-all
# Restore from the snapshot
wget https://stable-snapshot.s3.eu-central-1.amazonaws.com/snapshot.tar.lz4
tar -I lz4 -xf snapshot.tar.lz4 -C ~/.stabled/
sudo systemctl start ${SERVICE_NAME}
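The backup step above is worth running before any reset, so it can be wrapped in a small function. This is a sketch only: the name `backup_validator_files` is hypothetical, and besides the two key files it also copies `data/priv_validator_state.json`, on the assumption that the node uses the default CometBFT directory layout.

```bash
#!/bin/bash
# backup_validator_files HOME_DIR DEST_DIR (hypothetical helper)
# Copy the validator key, the node key, and the validator signing state from
# HOME_DIR into DEST_DIR, creating DEST_DIR if needed. Missing files produce
# a warning on stderr and are skipped.
backup_validator_files() {
  local home_dir="$1" dest="$2" f
  mkdir -p "$dest" || return 1
  for f in config/priv_validator_key.json config/node_key.json data/priv_validator_state.json; do
    if [ -f "$home_dir/$f" ]; then
      cp -p "$home_dir/$f" "$dest/" || return 1
    else
      echo "warning: $home_dir/$f not found, skipping" >&2
    fi
  done
}
```

For example, `backup_validator_files ~/.stabled ~/backup` before running `unsafe-reset-all`.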
Peer Connection Issues
Issue: No peers connected
Symptoms:
"n_peers": 0
# 1. Check the firewall
sudo ufw status
sudo ufw allow 26656/tcp
# 2. Check the external IP
curl ifconfig.me
# 3. Update the external address
sed -i "s/external_address = .*/external_address = \"$(curl -s ifconfig.me):26656\"/" ~/.stabled/config/config.toml
# 4. Set seed nodes (edit the existing key in place so it stays inside the [p2p] section)
sed -i 's/^seeds = .*/seeds = "seed1@seed1.stable.network:26656,seed2@seed2.stable.network:26656"/' ~/.stabled/config/config.toml
# 5. Enable PEX
sed -i 's/pex = false/pex = true/' ~/.stabled/config/config.toml
sudo systemctl restart ${SERVICE_NAME}
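A malformed entry in `seeds` or `persistent_peers` is an easy mistake to miss. The sketch below checks that each entry has the `nodeid@host:port` shape, where the node ID is 40 lowercase hexadecimal characters, the form CometBFT node IDs take. The function names `valid_peer` and `valid_peer_list` are hypothetical. Under this check, entries such as `seed1@seed1.stable.network:26656` are rejected, so treat them as placeholders to be replaced with real node IDs.

```bash
#!/bin/bash
# valid_peer ENTRY (hypothetical helper): succeed if ENTRY looks like
# <40 hex chars>@<host>:<port>.
valid_peer() {
  [[ "$1" =~ ^[0-9a-f]{40}@[^:@[:space:]]+:[0-9]+$ ]]
}

# valid_peer_list LIST (hypothetical helper): check every entry of a
# comma-separated list and report the first invalid one on stderr.
valid_peer_list() {
  local entry
  local IFS=','
  for entry in $1; do
    valid_peer "$entry" || { echo "invalid peer entry: $entry" >&2; return 1; }
  done
}
```

For example, `valid_peer_list "$PEERS" && echo ok`, run before writing the value into `config.toml`.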
Database Issues
Issue: "Database corruption"
Error message:
Error initializing database: resource temporarily unavailable
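# 1. Stop the node
sudo systemctl stop ${SERVICE_NAME}
# 2. Check disk space
df -h ~/.stabled
# 3. Repair the database (confirm with `stabled debug --help` that your build provides these subcommands)
stabled debug kill-db ~/.stabled/data
stabled debug dump-db ~/.stabled/data > db_dump.txt
# 4. If the repair fails, resync
# The validator signing state lives in data/, so back it up before deleting the directory
cp ~/.stabled/data/priv_validator_state.json ~/priv_validator_state.json.bak
rm -rf ~/.stabled/data
# Restore from a snapshot
# 5. Start the node
sudo systemctl start ${SERVICE_NAME}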
Memory Issues
Issue: Process killed by the out-of-memory (OOM) killer
Symptoms:
stabled.service: Main process exited, code=killed, status=9/KILL
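# 1. Check memory usage
free -h
sudo dmesg | grep -i "killed process"
# 2. Add swap space
sudo fallocate -l 8G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
# 3. Tune memory usage
# Edit the existing keys in place: lines appended to the end of app.toml end up
# under the file's last section instead of at the top level.
sed -i 's/^iavl-cache-size = .*/iavl-cache-size = 781250/' ~/.stabled/config/app.toml   # lower this value on low-memory hosts
sed -i 's/^inter-block-cache = .*/inter-block-cache = false/' ~/.stabled/config/app.toml # disable on low-memory hosts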
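Running the swap setup twice appends a second `/swapfile` line to `/etc/fstab`. A minimal sketch of an idempotent alternative follows; the function name `ensure_line` is hypothetical, and it needs write access to the target file, so for `/etc/fstab` it has to run as root.

```bash
#!/bin/bash
# ensure_line FILE LINE (hypothetical helper): append LINE to FILE only if
# FILE does not already contain an identical line. grep -x matches whole
# lines and -F treats LINE as a fixed string instead of a regex.
ensure_line() {
  local file="$1" line="$2"
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

Called as `ensure_line /etc/fstab '/swapfile none swap sw 0 0'`, it can be repeated safely after a partial or interrupted setup.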
Disk Space Issues
Issue: No space left on device
Error message:
Error: write ~/.stabled/data/blockstore.db/001234.log: no space left on device
# 1. Check disk usage
df -h
du -sh ~/.stabled/*
# 2. Clean up logs
sudo journalctl --vacuum-time=7d
sudo journalctl --vacuum-size=500M
# 3. Prune blockchain data
sudo systemctl stop ${SERVICE_NAME}
stabled prune
# 4. Delete old snapshots
rm -rf ~/.stabled/data/snapshots/
# 5. Migrate to a larger disk
# See the migration section below
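Disk exhaustion is easier to prevent than to repair, so a periodic check can warn before the node stops writing. The sketch below parses `df -P` to get the usage percentage of the filesystem that holds a path. The function names `disk_usage_pct` and `disk_usage_exceeds` are hypothetical, and any threshold passed in is an example, not a recommendation from this guide.

```bash
#!/bin/bash
# disk_usage_pct [PATH] (hypothetical helper): print the used-space percentage
# (digits only) of the filesystem containing PATH, defaulting to /.
# df -P forces the POSIX single-line layout, with capacity in column 5.
disk_usage_pct() {
  df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# disk_usage_exceeds PATH THRESHOLD (hypothetical helper): succeed if usage of
# the filesystem containing PATH is at or above THRESHOLD percent.
disk_usage_exceeds() {
  local pct
  pct=$(disk_usage_pct "$1")
  [ -n "$pct" ] && [ "$pct" -ge "$2" ]
}
```

For example, `disk_usage_exceeds ~/.stabled 90 && echo "disk almost full"` could run from cron.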
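Error Message Reference
| Error | Cause | Solution |
|---|---|---|
| wrong Block.Header.AppHash | State corruption | Resync from a snapshot |
| validator set is nil | Genesis file mismatch | Download the correct genesis file |
| connection refused | Service not running | Start the service |
| timeout waiting for tx to be included | Network congestion | Increase the gas price |
| account sequence mismatch | Wrong nonce | Query the current nonce |
| insufficient fees | Gas price too low | Increase the gas price |
| signature verification failed | Key mismatch | Check the key configuration |
| module account has not been set | Initialization error | Reinitialize the node |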
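The reference table can also be applied mechanically to a log line. The sketch below encodes its rows as a `case` statement; the function name `suggest_fix` is hypothetical, and matching is a case-insensitive substring test, so it only recognises messages that contain the table's wording.

```bash
#!/bin/bash
# suggest_fix MESSAGE (hypothetical helper): print the solution from the error
# reference table for the first known error text found in MESSAGE.
# Returns 1 when nothing matches.
suggest_fix() {
  local msg
  msg=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$msg" in
    *"wrong block.header.apphash"*)            echo "Resync from a snapshot" ;;
    *"validator set is nil"*)                  echo "Download the correct genesis file" ;;
    *"connection refused"*)                    echo "Start the service" ;;
    *"timeout waiting for tx to be included"*) echo "Increase the gas price" ;;
    *"account sequence mismatch"*)             echo "Query the current nonce" ;;
    *"insufficient fees"*)                     echo "Increase the gas price" ;;
    *"signature verification failed"*)         echo "Check the key configuration" ;;
    *"module account has not been set"*)       echo "Reinitialize the node" ;;
    *)                                         return 1 ;;
  esac
}
```

For example, `suggest_fix "$(sudo journalctl -u stable -n 1 --no-pager)"` looks up the most recent log line.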
Getting Help
Collecting Debug Information
#!/bin/bash
# collect-debug-info.sh
# Set the service name (default: stable)
export SERVICE_NAME=stable
OUTPUT_DIR="stable-debug-$(date +%Y%m%d-%H%M%S)"
mkdir -p "$OUTPUT_DIR"
echo "Collecting debug info..."
# System info
uname -a > "$OUTPUT_DIR/system.txt"
df -h >> "$OUTPUT_DIR/system.txt"
free -h >> "$OUTPUT_DIR/system.txt"
# Service status
systemctl status "${SERVICE_NAME}" --no-pager > "$OUTPUT_DIR/service-status.txt"
# Recent logs
sudo journalctl -u "${SERVICE_NAME}" --since "1 hour ago" --no-pager > "$OUTPUT_DIR/recent-logs.txt"
# Config files (lines containing "priv" are dropped; review the copies for other sensitive values before sharing)
grep -v "priv" ~/.stabled/config/config.toml > "$OUTPUT_DIR/config.toml"
grep -v "priv" ~/.stabled/config/app.toml > "$OUTPUT_DIR/app.toml"
# Node status
curl -s localhost:26657/status > "$OUTPUT_DIR/node-status.json" 2>/dev/null
# Create the archive
tar -czf "$OUTPUT_DIR.tar.gz" "$OUTPUT_DIR/"
echo "Debug info collected: $OUTPUT_DIR.tar.gz"
echo "Share this file when requesting support"

