From Zero to One: Deploying SkyWalking 9.x with Elasticsearch 8.x in Production

Published: 2026/6/30 14:41:15
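1. Environment Preparation and Component Downloads

Before deploying, make sure the servers meet the basic requirements. I recommend a stable Linux distribution such as CentOS 7 or Ubuntu 20.04 LTS; in my testing these two gave the fewest compatibility problems with SkyWalking and Elasticsearch. For production, plan on at least 4 CPU cores, 8 GB of memory, and 100 GB of storage. Elasticsearch in particular is memory-hungry.

Install the required dependencies first:

    # CentOS
    sudo yum install -y java-11-openjdk-devel wget unzip

    # Ubuntu
    sudo apt-get update
    sudo apt-get install -y openjdk-11-jdk wget unzip

Verify that Java is installed:

    java -version
    # Should print something like: openjdk version "11.0.12"

Next, download the components. This is a pitfall I have hit myself: some version combinations are incompatible. After repeated testing, the combination I found most stable is SkyWalking 9.5.0 with Elasticsearch 8.5.3.

Download from the official archives. Note that the Java agent is released separately from the APM backend and sits under the java-agent/ directory of the Apache archive, so confirm the exact agent version there before downloading:

    wget https://archive.apache.org/dist/skywalking/9.5.0/apache-skywalking-apm-9.5.0.tar.gz
    wget https://archive.apache.org/dist/skywalking/java-agent/9.5.0/apache-skywalking-java-agent-9.5.0.tgz
    wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-8.5.3-linux-x86_64.tar.gz

2. Elasticsearch 8.x Deployment

2.1 Installation and Basic Configuration

Extract the archive into the target directory:

    mkdir -p /opt/elasticsearch
    tar -xzf elasticsearch-8.5.3-linux-x86_64.tar.gz -C /opt/elasticsearch

Create a dedicated user, because Elasticsearch refuses to run as root. The data and log directories referenced in the configuration below must also exist and be writable by that user:

    useradd -M -s /bin/false elasticsearch
    # Directories used by path.data and path.logs below
    mkdir -p /var/lib/elasticsearch /var/log/elasticsearch
    chown -R elasticsearch:elasticsearch /opt/elasticsearch /var/lib/elasticsearch /var/log/elasticsearch

Edit the key settings in /opt/elasticsearch/elasticsearch-8.5.3/config/elasticsearch.yml:

    cluster.name: skywalking-cluster
    node.name: node-1
    path.data: /var/lib/elasticsearch
    path.logs: /var/log/elasticsearch
    network.host: 0.0.0.0
    discovery.type: single-node
    # Disabled here to keep the walkthrough simple; enable it in production (see 5.2)
    xpack.security.enabled: false

2.2 System Tuning and Startup

Raise the system limits in /etc/security/limits.conf:

    elasticsearch soft nofile 65535
    elasticsearch hard nofile 65535
    elasticsearch soft memlock unlimited
    elasticsearch hard memlock unlimited

These entries are applied through PAM to login sessions. For a service started by systemd, the LimitNOFILE and LimitMEMLOCK directives in the unit file below are what actually take effect.

Create the systemd unit /usr/lib/systemd/system/elasticsearch.service:

    [Unit]
    Description=Elasticsearch
    After=network.target

    [Service]
    User=elasticsearch
    Group=elasticsearch
    Environment=ES_HOME=/opt/elasticsearch/elasticsearch-8.5.3
    Environment=ES_PATH_CONF=/opt/elasticsearch/elasticsearch-8.5.3/config
    ExecStart=/opt/elasticsearch/elasticsearch-8.5.3/bin/elasticsearch
    LimitNOFILE=65535
    LimitMEMLOCK=infinity

    [Install]
    WantedBy=multi-user.target

Start the service and verify it:

    systemctl daemon-reload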
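    systemctl start elasticsearch
    systemctl enable elasticsearch

    # Verify that the node is running
    curl -X GET localhost:9200/
    # Should return JSON containing "You Know, for Search"

3. SkyWalking 9.x Core Component Deployment

3.1 OAP Server Configuration

Extract the SkyWalking package:

    mkdir -p /opt/skywalking
    tar -xzf apache-skywalking-apm-9.5.0.tar.gz -C /opt/skywalking

Edit the storage section of /opt/skywalking/apache-skywalking-apm-bin/config/application.yml:

    storage:
      selector: ${SW_STORAGE:elasticsearch}
      elasticsearch:
        namespace: ${SW_NAMESPACE:sw_index}
        clusterNodes: ${SW_STORAGE_ES_CLUSTER_NODES:localhost:9200}
        protocol: ${SW_STORAGE_ES_HTTP_PROTOCOL:http}
        user: ${SW_ES_USER:""}
        password: ${SW_ES_PASSWORD:""}
        dayStep: ${SW_STORAGE_DAY_STEP:1}                               # one index per day
        indexShardsNumber: ${SW_STORAGE_ES_INDEX_SHARDS_NUMBER:1}       # number of shards
        indexReplicasNumber: ${SW_STORAGE_ES_INDEX_REPLICAS_NUMBER:1}   # number of replicas

With the single-node Elasticsearch set up above, a replica can never be allocated, so a replica count of 1 leaves every index in yellow state. Setting SW_STORAGE_ES_INDEX_REPLICAS_NUMBER to 0 avoids that until you add more nodes.

3.2 Web UI Configuration

Change the web application port in /opt/skywalking/apache-skywalking-apm-bin/webapp/webapp.yml:

    server:
      port: 12880
    collector:
      path: /graphql
      ribbon:
        ReadTimeout: 10000
        listOfServers: 127.0.0.1:12800

This collector.ribbon layout comes from the 8.x releases. As far as I can tell, 9.4 and later read webapp/application.yml with serverPort and oapServices keys instead, so check which file actually ships in your package and set the port and the OAP address (127.0.0.1:12800) there.

3.3 Starting and Managing the Services

Create the start script /opt/skywalking/start_sw.sh and make it executable (chmod +x /opt/skywalking/start_sw.sh):

    #!/bin/bash
    # Start the OAP server in the background, discarding console output
    nohup /opt/skywalking/apache-skywalking-apm-bin/bin/oapService.sh > /dev/null 2>&1 &
    # Start the Web UI in the background, discarding console output
    nohup /opt/skywalking/apache-skywalking-apm-bin/bin/webappService.sh > /dev/null 2>&1 &

To start SkyWalking at boot, create /etc/systemd/system/skywalking.service:

    [Unit]
    Description=SkyWalking Service
    After=network.target elasticsearch.service

    [Service]
    Type=forking
    ExecStart=/opt/skywalking/start_sw.sh
    ExecStop=/bin/kill -TERM $MAINPID

    [Install]
    WantedBy=multi-user.target

Verify the service status:

    # Check the OAP log
    tail -f /opt/skywalking/apache-skywalking-apm-bin/logs/skywalking-oap-server.log
    # You should see: Storage ElasticSearch client is connected

    # Access the Web UI
    curl -I http://localhost:12880
    # Should return HTTP 200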
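One gap in the unit file from 3.3 is that After=elasticsearch.service only orders unit start-up; it does not wait until Elasticsearch actually answers HTTP requests. A minimal sketch of how start_sw.sh could wait first is shown below. The helper name retry_until is hypothetical (it is not part of SkyWalking or Elasticsearch), and the attempt count and delay in the usage comment are assumptions to adjust for your environment.

```shell
# Hypothetical helper (not part of SkyWalking): run a probe command until it
# succeeds, up to a maximum number of attempts, sleeping between tries.
retry_until() {
  attempts="$1"   # maximum number of tries
  delay="$2"      # seconds to sleep between tries
  shift 2         # the remaining arguments are the probe command
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0    # probe succeeded
    fi
    i=$((i + 1))
    # Sleep only if another attempt will follow
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1        # probe never succeeded
}

# Possible use at the top of start_sw.sh (assumes Elasticsearch on localhost:9200):
#   retry_until 30 2 curl -sf -o /dev/null http://localhost:9200/ || exit 1
```

Passing the probe as a plain command keeps the helper independent of curl, so the same function could also wait for the OAP REST port before the Web UI is started.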
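4. Agent Configuration and Practical Tips

4.1 Java Agent Deployment

Extract the agent package:

    tar -xzf apache-skywalking-java-agent-9.5.0.tgz -C /opt/skywalking

A typical start command for a Java application:

    java -javaagent:/opt/skywalking/skywalking-agent/skywalking-agent.jar \
         -Dskywalking.agent.service_name=your_service_name \
         -Dskywalking.collector.backend_service=localhost:11800 \
         -jar your_application.jar

4.2 Advanced Configuration Tips

Key parameters in the agent configuration file /opt/skywalking/skywalking-agent/config/agent.config:

    # Service name
    agent.service_name=${SW_AGENT_NAME:Your_Application}
    # Maximum number of traces sampled per 3 seconds; zero or a negative value samples everything
    agent.sample_n_per_3_secs=${SW_AGENT_SAMPLE:-1}
    # Ignore requests with these suffixes
    agent.ignore_suffix=${SW_AGENT_IGNORE_SUFFIX:.jpg,.jpeg,.png,.css,.js,.gif}
    # Cross-process propagation (check that your agent version ships this key)
    agent.cross_process_propagation=${SW_AGENT_CROSS_PROCESS:true}

Note that sample_n_per_3_secs is a count, not a ratio, so values such as 0.1 or 0.3 are not meaningful here. In production, set a positive integer sized to your traffic to cap the trace volume.

4.3 Troubleshooting Common Problems

If data does not show up, check the following in order.

Confirm that the OAP server log contains no errors.

Check that the Elasticsearch indices were created:

    curl -X GET 'localhost:9200/_cat/indices?v'

Check that OAP is listening on its gRPC port, 11800, which is the port the agents report to:

    netstat -tulnp | grep 11800

If that is not enough, raise the log level by setting the level attribute of the Root logger in /opt/skywalking/apache-skywalking-apm-bin/config/log4j2.xml:

    <Root level="DEBUG">

5. Production Optimization Recommendations

5.1 Performance Tuning Parameters

Elasticsearch heap settings go in config/jvm.options or, preferably, in a file under config/jvm.options.d/. Comments must be on their own lines:

    # Initial heap size
    -Xms4g
    # Maximum heap size; keep it at or below 50% of physical memory
    -Xmx4g
    -XX:+UseG1GC

SkyWalking OAP memory is set in bin/oapService.sh:

    export JAVA_OPTS="-Xms2g -Xmx2g -XX:+UseG1GC"

On the 8 GB minimum from section 1, a 4 GB Elasticsearch heap plus a 2 GB OAP heap leaves little headroom for the operating system and file cache, so size the host accordingly or run the two on separate machines.

5.2 Security Configuration

Enable the Elasticsearch security features in elasticsearch.yml:

    xpack.security.enabled: true
    xpack.security.transport.ssl.enabled: true

Generate a password and configure SkyWalking to use it:

    # Generate a new password for the elastic user
    /opt/elasticsearch/elasticsearch-8.5.3/bin/elasticsearch-reset-password -u elastic

    # Then update the storage section of SkyWalking's application.yml
    storage:
      elasticsearch:
        user: elastic
        password: your_strong_password

5.3 Monitoring and Maintenance

Schedule a periodic cleanup job with crontab -e. Inside a crontab, every % must be escaped as \%:

    # Every day at midnight, delete the index from 7 days ago
    0 0 * * * curl -X DELETE "http://localhost:9200/sw_index-$(date -d '-7 days' +\%Y\%m\%d)"

The index name in this command has to match what _cat/indices actually shows. SkyWalking places the index name between the namespace and the date, so adjust the path to your real index names. OAP also expires data on its own according to the recordDataTTL and metricsDataTTL settings in the core section of application.yml, so treat the cron job as a safety net rather than the primary retention mechanism.

Configure alarm rules in /opt/skywalking/apache-skywalking-apm-bin/config/alarm-settings.yml:

    rules:
      service_resp_time_rule:
        metrics-name: service_resp_time
        op: ">"
        threshold: 1000
        period: 10
        count: 3
        silence-period: 5
        message: Response time of {name} is more than 1s

In real deployments I found the combination of SkyWalking 9.x and Elasticsearch 8.x noticeably more stable, but version matching is critical. I once mixed 8.x and 9.x components, which caused inconsistent data and took a long time to track down. I recommend sticking strictly to the version combination in this article; it avoids most of the compatibility problems I ran into.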
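As a follow-up to the cleanup job in 5.3, deleting by exact index name is safer than guessing a single name from the date. The sketch below filters a list of index names down to those whose date suffix is older than a cutoff. The function name sw_expired_indices is hypothetical, and the assumption that index names end in -yyyyMMdd should be confirmed against your own _cat/indices output before using it.

```shell
# Hypothetical helper: read index names on stdin (one per line) and print
# those whose trailing -yyyyMMdd suffix is strictly older than the cutoff.
# Names without such a suffix are skipped and never printed.
sw_expired_indices() {
  cutoff="$1"   # cutoff date as yyyyMMdd, for example 20260623
  while IFS= read -r name; do
    case "$name" in
      *-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;   # ends in -yyyyMMdd
      *) continue ;;                                   # no date suffix
    esac
    suffix="${name##*-}"   # text after the last dash
    # yyyyMMdd values compare correctly as plain integers
    if [ "$suffix" -lt "$cutoff" ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Possible use (assumes GNU date and an unsecured node on localhost:9200):
#   cutoff="$(date -d '-7 days' +%Y%m%d)"
#   curl -s 'http://localhost:9200/_cat/indices/sw_index*?h=index' \
#     | sw_expired_indices "$cutoff" \
#     | while IFS= read -r idx; do
#         curl -s -X DELETE "http://localhost:9200/${idx}"
#       done
```

Because each DELETE names one concrete index, this approach does not rely on wildcard deletion, and the filter can be dry-run first by dropping the final loop and reading the printed names.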