Elasticsearch Too Many Open Files Error

Posted on 2019-12-15 In elasticsearch

Our production ES cluster suddenly started throwing "too many open files" errors.

First, check the cluster's current file descriptor status with the following command:

curl -XGET '127.0.0.1:9200/_cat/nodes?v&h=ip,fdc,fdm'
ip fdc fdm
100.85.71.6 103828 128000
100.85.71.7 109436 128000
100.85.71.9 95884 128000
100.85.71.8 101105 128000
100.85.71.10 103331 128000

Definitions:
fdc: file_desc.current, fdc, fileDescriptorCurrent
fdm: file_desc.max, fdm, fileDescriptorMax
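
The same numbers are also exposed per node by the node stats API; a minimal sketch (filter_path is optional and only trims the JSON response):

curl -XGET '127.0.0.1:9200/_nodes/stats/process?filter_path=**.open_file_descriptors,**.max_file_descriptors&pretty'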

Fix the problem first

Increase the file descriptor limit

  • Display the current hard limits of your machine. The hard limit is the maximum value that can be set without tuning kernel parameters in the proc filesystem:
$ ulimit -aH
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 256940
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 128000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) unlimited
cpu time (seconds, -t) unlimited
max user processes (-u) 65535
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
  • Edit /etc/security/limits.conf:

    $ sudo cat /etc/security/limits.conf | grep -v '#'

    * soft nofile 128000
    * hard nofile 128000
    * soft nproc 65535
    * hard nproc 65535

    es soft memlock unlimited
    es hard memlock unlimited
    (each entry follows the format: <domain> <type> <item> <value>)
  • No reboot is needed: limits.conf is applied at login via PAM, so log out and back in (or restart the service) and then restart Elasticsearch so the process picks up the new limit. You can verify it against the running process as shown after this list.

  • Add the parameter -Des.max-open-files=true when starting ES; in older versions this makes Elasticsearch log the maximum number of open files it is allowed at startup, which confirms the new limit is in effect.
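
As a sanity check, the limit can also be read straight from the running process. A minimal sketch, assuming a single node on the host and that the PID can be found via pgrep:

# find the Elasticsearch PID (assumes one node per host)
ES_PID=$(pgrep -f org.elasticsearch.bootstrap.Elasticsearch | head -n 1)

# "Max open files" is the limit the process is actually running with
grep 'Max open files' /proc/${ES_PID}/limits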

After the fix: thinking about the root cause

File descriptor (fd): Linux takes the view that everything is a file, so operations in Linux are, one way or another, file operations. But you cannot search for a file from scratch every time, so, much like an index in a database, Linux keeps an index over files as well, and that index is the file descriptor.

An fd is essentially just an index number plus a pointer to the underlying file. Note that file descriptors are per-process.
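
You can see this per-process table under /proc; for example, the descriptors of the current shell:

# each entry is a descriptor number symlinked to the file, socket, or pipe it refers to
ls -l /proc/$$/fd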

And ES uses a huge number of files:

Lucene uses a very large number of files. At the same time, Elasticsearch uses a large number of sockets to communicate between nodes and HTTP clients. All of this requires available file descriptors.

Sadly, many modern Linux distributions ship with a paltry 1,024 file descriptors allowed per process. This is far too low for even a small Elasticsearch node, let alone one that is handling hundreds of indices.

You should increase your file descriptor count to something very large, such as 64,000. This process is irritatingly difficult and highly dependent on your particular OS and distribution. Consult the documentation for your OS to determine how best to change the allowed file descriptor count.
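
To see what those descriptors actually are on a node, you can break them down by type. A sketch, again assuming the PID is found via pgrep (lsof may require sudo):

ES_PID=$(pgrep -f org.elasticsearch.bootstrap.Elasticsearch | head -n 1)

# total descriptors currently open by the process
ls /proc/${ES_PID}/fd | wc -l

# rough breakdown by type: REG = regular (mostly Lucene) files, IPv4/IPv6 = sockets
lsof -p ${ES_PID} | awk 'NR>1 {print $5}' | sort | uniq -c | sort -rn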

So this value has to be increased…

Besides that, ES shards (primary and replica) are another reason ES keeps so many files open, since every shard is itself a Lucene index with its own set of segment files.
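
A quick way to see how many shards (and therefore Lucene indices) the cluster is carrying:

# one line per shard, primaries and replicas alike
curl -s -XGET '127.0.0.1:9200/_cat/shards' | wc -l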

ES performance tuning (a very good reference)

You also need to control the index life cycle, using index templates together with rollup or delete operations, for example:
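
A minimal sketch of dropping old time-based indices by hand (the logstash-2019.11.* naming pattern is hypothetical; newer ES versions can automate this with ILM):

# delete last month's daily indices in one call (hypothetical index name pattern)
curl -XDELETE '127.0.0.1:9200/logstash-2019.11.*'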
