Few days back a new bug came so I did what a good programmer would do. I fixed it.
I did my part of the testing in my development environment. And moved the code to testing environment. The code fix was regarding a schedule task, that used to take file from some location and used to process it line by line. So I started the scheduled task and started to see the logs of test environment. Task was successfully launched that I could see in the logs. But after that nothing was happening means nothing was coming in the logs. So checked the file that was supposed to be processed by the task. It was locked by the task as we were using Camel, so it appended the .lock in the end of the file. The point is file was picked up by the task, but nothing was happening after that. So I thought it is stuck there cause of my changed code. But I was not able to figure out why?
So after that I came to know about the command "kill -3 pid", that was shown by one of my colleague. What it did is dumped the information about all the threads stack trace those were belonging to that process. So I could see that my thread was executing some method. As file was very big, having around half a million record in it.
So after that I started to look what we can do if we get into this kind of situations. What are the tools we have to tackle such problems.
Note:- Threads are a popular programming abstraction for parallel execution on modern operating systems. When threads are forked inside a program for multiple flows of execution, these threads share certain resources (e.g., memory address space, open files) among themselves to minimize forking overhead and avoid expensive IPC (inter-process communication) channel. These properties make threads an efficient mechanism for concurrent execution.
The jps tool lists the instrumented HotSpot Java Virtual Machines (JVMs) on the target system. The tool is limited to reporting information on JVMs for which it has the access permissions.
The jps command will report the local VM identifier, or lvmid, for each instrumented JVM found on the target system. The lvmid is typically, but not necessarily, the operating system's process identifier for the JVM process. With no options, jps will list each Java application's lvmid followed by the short form of the application's class name or jar file name. The jps command uses the java launcher to find the class name and arguments passed to the main method.
-m: Output the arguments passed to the main method. The output may be null for embedded JVMs.
-l: Output the full package name for the application's main class or the full path name to the application's JAR file.
-v: Output the arguments passed to the JVM.
-V: Output the arguments passed to the JVM through the flags file (the .hotspotrc file or the file specified by the -XX:Flags=<filename> argument).
OUTPUT FORMAT
The output of the jps command follows the following pattern:
lvmid [ [ classname | JARfilename | "Unknown"] [ arg* ] [ jvmarg* ] ]
Example : -
So using above tool you will come to know lvmid (that is in most cases is the os pid) of all the VM that is running on that machine.
Now lets dig more using lvmid that we got from the above command.
Now using that lvmid or for discussion pid we can come to know stack trace of all the threads of given process. We can come to know what is our thread is doing. It is in which state? Is it running or sleeping?
jstack:-
jstack prints Java stack traces of Java threads for a given Java process. For each Java frame, the full class name, method name, 'bci' (byte code index) and line number, if available, are printed.
Example:-
Using jps command we will come to know about the pid of our java app. Now we can use the top command to know pid of our threads.
top -H -p 4071
Output:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4091 arham 20 0 2258864 59000 10304 S 6.0 1.6 3:49.35 java
4090 arham 20 0 2258864 59000 10304 S 5.3 1.6 3:45.30 java
So what we see as nid=0x2d6d in jstack output is PID=4091 in top output are the same.
Where nid stands for native id.
So now if we want to see if something is being done our thread 4091 in we will have to keep looking the stack trace using jstack. For this we can use 'watch' command.
Command :-
jstack 4071 | awk '/ nid='"$(printf '%#x' 4091)"' /,/^$/' //Just to print your stack frame.
watch -n .5 jstack 4071 | awk '/ nid='"$(printf '%#x' 4091)"' /,/^$/' //For keep watch your stack frame
Reference:-https://www.pythian.com/blog/a-simple-way-to-monitor-java-in-linux/
I did my part of the testing in my development environment. And moved the code to testing environment. The code fix was regarding a schedule task, that used to take file from some location and used to process it line by line. So I started the scheduled task and started to see the logs of test environment. Task was successfully launched that I could see in the logs. But after that nothing was happening means nothing was coming in the logs. So checked the file that was supposed to be processed by the task. It was locked by the task as we were using Camel, so it appended the .lock in the end of the file. The point is file was picked up by the task, but nothing was happening after that. So I thought it is stuck there cause of my changed code. But I was not able to figure out why?
So after that I came to know about the command "kill -3 pid", that was shown by one of my colleague. What it did is dumped the information about all the threads stack trace those were belonging to that process. So I could see that my thread was executing some method. As file was very big, having around half a million record in it.
So after that I started to look what we can do if we get into this kind of situations. What are the tools we have to tackle such problems.
Note:- Threads are a popular programming abstraction for parallel execution on modern operating systems. When threads are forked inside a program for multiple flows of execution, these threads share certain resources (e.g., memory address space, open files) among themselves to minimize forking overhead and avoid expensive IPC (inter-process communication) channel. These properties make threads an efficient mechanism for concurrent execution.
In Linux, threads (also called Lightweight Processes (LWP)) created within a program will have the same "thread group ID" as the program's PID. Each thread will then have its own thread ID (TID). To the Linux kernel's scheduler, threads are nothing more than standard processes which happen to share certain resources. Classic command-line tools such as ps or top, which display process-level information by default, can be instructed to display thread-level information.
Here are a few things you need to know before starting. Following the links is not necessary, they are available for the reference.
- there are different vendors of Java Virtual Machine. This post is about Oracle’s JVM which is called HotSpot. Linux x86-64 is considered as OS platform. Most of the things about HotSpot are applicable to other vendors too but with slight changes. OSes other than Linux may add some more complications
- it’s called Virtual Machine, because it virtualizes runtime environment for a Java application. So to know where to look at you need to know a few things about how specific VM is organized. For a detailed overview of the HotSpot, please refer to this article
- on Linux, a thread inside HotSpot VM is mapped directly to an OS level thread. Well, it may not be true on all OSes, but for modern Linux kernels this is true. So every thread on the OS level is a thread inside a Java application
- there are generally two types of threads inside a HotSpot VM: native and application threads. Application threads are those that run some Java code, and usually this is what applications are using to run their code. Native threads run something which is not written in Java, usually it’s code in C/C++ and usually all these threads are special utility threads started by a VM itself.
That note is just to make you clear about the thread.
jps(JVM Process Status):-The jps tool lists the instrumented HotSpot Java Virtual Machines (JVMs) on the target system. The tool is limited to reporting information on JVMs for which it has the access permissions.
The jps command will report the local VM identifier, or lvmid, for each instrumented JVM found on the target system. The lvmid is typically, but not necessarily, the operating system's process identifier for the JVM process. With no options, jps will list each Java application's lvmid followed by the short form of the application's class name or jar file name. The jps command uses the java launcher to find the class name and arguments passed to the main method.
-m: Output the arguments passed to the main method. The output may be null for embedded JVMs.
-l: Output the full package name for the application's main class or the full path name to the application's JAR file.
-v: Output the arguments passed to the JVM.
-V: Output the arguments passed to the JVM through the flags file (the .hotspotrc file or the file specified by the -XX:Flags=<filename> argument).
OUTPUT FORMAT
The output of the jps command follows the following pattern:
lvmid [ [ classname | JARfilename | "Unknown"] [ arg* ] [ jvmarg* ] ]
Example : -
jps
4071 com.analyzer.thread.JobRunner
18032 jps
18005 jstat
So using above tool you will come to know lvmid (that is in most cases is the os pid) of all the VM that is running on that machine.
Now lets dig more using lvmid that we got from the above command.
Now using that lvmid or for discussion pid we can come to know stack trace of all the threads of given process. We can come to know what is our thread is doing. It is in which state? Is it running or sleeping?
jstack:-
jstack prints Java stack traces of Java threads for a given Java process. For each Java frame, the full class name, method name, 'bci' (byte code index) and line number, if available, are printed.
Example:-
jstack 4071
Output:-2016-05-15 00:42:02 "t2" prio=10 tid=0x00007f67200ab800 nid=0x2d6e waiting on condition [0x00007f6713af9000] java.lang.Thread.State: TIMED_WAITING (sleeping) at java.lang.Thread.sleep(Native Method) at com.analyzer.thread.Job.run(Job.java:8) at java.lang.Thread.run(Thread.java:745) "t1" prio=10 tid=0x00007f67200aa000 nid=0x2d6d waiting on condition [0x00007f6713bfa000] java.lang.Thread.State: TIMED_WAITING (sleeping) at java.lang.Thread.sleep(Native Method) at com.analyzer.thread.Job.run(Job.java:8) at java.lang.Thread.run(Thread.java:745)
top:-
Using jps command we will come to know about the pid of our java app. Now we can use the top command to know pid of our threads.
top -H -p 4071
Output:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4091 arham 20 0 2258864 59000 10304 S 6.0 1.6 3:49.35 java
4090 arham 20 0 2258864 59000 10304 S 5.3 1.6 3:45.30 java
So what we see as nid=0x2d6d in jstack output is PID=4091 in top output are the same.
Where nid stands for native id.
So now if we want to see if something is being done our thread 4091 in we will have to keep looking the stack trace using jstack. For this we can use 'watch' command.
Command :-
jstack 4071 | awk '/ nid='"$(printf '%#x' 4091)"' /,/^$/' //Just to print your stack frame.
watch -n .5 jstack 4071 | awk '/ nid='"$(printf '%#x' 4091)"' /,/^$/' //For keep watch your stack frame
Reference:-https://www.pythian.com/blog/a-simple-way-to-monitor-java-in-linux/
No comments:
Post a Comment