Wednesday, October 12, 2011

Resolving the "Too many open files" error in Linux


Situation:
Users couldn't execute *ANY* commands. Every command failed with a "Too many open files" error, as shown below:

orabi@miaash02-t1$ top
ksh: top: /usr/bin/top: cannot execute [Too many open files in system]
dwilliams@miaash02-t1$ ls -l
ksh: ls: /bin/ls: cannot execute [Too many open files in system]


Reason:
Everything in Linux is a file; Linux treats most things, including devices, sockets and pipes, as files.  There is a kernel parameter called “file-max” which controls the maximum number of files that can be open system-wide. The default is often around 65K, although the kernel scales it with the amount of memory, so it varies between machines. The current value can be found using the following command:
"sysctl -a | grep file-max".
To count the number of open files, we can use the following command: "lsof | wc -l". However, this will not give you the exact number. A single file can be opened multiple times for reading, and each additional concurrent open counts against the file-max value.  In addition, connections to network ports also eat into the ‘file-max’ value.
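
Since lsof only gives an approximation, the kernel's own counter in /proc/sys/fs/file-nr is usually a better place to look. It holds three fields: allocated file handles, allocated but unused handles, and the maximum (the same number as file-max). The snippet below is a minimal sketch of reading it; the function name file_handle_usage and its optional path argument are my own, added only for illustration.

```shell
#!/bin/sh
# Report system-wide file handle usage from /proc/sys/fs/file-nr.
# file_handle_usage is a hypothetical helper name, not a standard tool.
file_handle_usage() {
    # Optional argument: an alternative file to read (handy for testing).
    nr_file="${1:-/proc/sys/fs/file-nr}"
    # Fields: allocated handles, allocated-but-unused handles, maximum.
    read -r allocated unused max < "$nr_file"
    in_use=$((allocated - unused))
    echo "in_use=$in_use max=$max percent=$((in_use * 100 / max))"
}

# Only run against the real counter when it is readable (Linux).
if [ -r /proc/sys/fs/file-nr ]; then
    file_handle_usage
fi
```

If the percentage is close to 100, the system is about to hit (or has already hit) the limit described above.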

Resolution:
As a temporary solution, we can increase the ‘file-max’ value by issuing the following command:
# echo "value" > /proc/sys/fs/file-max
Here "value" is the new limit. After that, we need to identify the underlying problem, either by analyzing the system logs or by using the lsof command itself with appropriate options.
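
To narrow down which process is eating the file handles, one option besides lsof is to count the entries under /proc/<pid>/fd for every process. The following is only a sketch of that idea, not the exact procedure used in this incident; the function name top_fd_holders and its two optional arguments are assumptions of mine for illustration.

```shell
#!/bin/sh
# List the processes holding the most open file descriptors by counting
# the entries under /proc/<pid>/fd. Run as root to see every process.
# top_fd_holders is a hypothetical helper name, not a standard tool.
top_fd_holders() {
    proc_root="${1:-/proc}"   # alternative root, useful for testing
    limit="${2:-10}"          # how many processes to print
    for dir in "$proc_root"/[0-9]*; do
        # Skip anything without an fd directory (or a glob with no match).
        [ -d "$dir/fd" ] || continue
        # Arithmetic expansion strips any padding wc may add.
        count=$(($(ls "$dir/fd" 2>/dev/null | wc -l)))
        name=$(cat "$dir/comm" 2>/dev/null)
        # Output format: <open fds> <pid> <command name>
        echo "$count ${dir##*/} $name"
    done | sort -rn | head -n "$limit"
}

top_fd_holders
```

Each output line shows the descriptor count, the PID and the command name, with the heaviest users first. A process at the top of the list with a count that keeps growing is a likely candidate for a descriptor leak.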