Thread/Heap dumping using MBean Toolkit

8 Sep

Introduction

Thread and heap dumps can be used to analyse JVM concurrency and performance issues.

Thread dump:
A dump listing all threads currently live in the JVM, together with their stack traces

Heap dump:
A dump containing all objects on the JVM heap (usually a large file, as it is a snapshot of the heap itself)

Taking Heap/Thread dumps remotely using MBean Toolkit

The GitHub repository https://github.com/techshare1337/MBeanToolkit contains a jar file that can be launched like a normal executable. The app lets you take heap/thread dumps from your local PC by connecting to a JVM on another machine. To connect to a JVM on another machine, the following flags need to be set on that JVM, which can be done by following the guide below, ‘Extracting the heap dump from the server’. The source code for the project lives in the same repository.

-Dcom.sun.management.jmxremote # allow remote JMX connections
-Dcom.sun.management.jmxremote.port=8802 # port to connect on (doesn't have to be 8802)
-Dcom.sun.management.jmxremote.ssl=false # disable SSL
-Dcom.sun.management.jmxremote.authenticate=false # disable authentication
-Djava.rmi.server.hostname=volrd10.int.corp.sun # hostname or IP of the machine the JVM runs on
  • Connect to the JVM using [hostname]:[port], e.g. ‘volrd10.int.corp.sun:8802’ from the example above
  • To take a heap dump
    • Give the heap dump a filename and click heap dump. The dump is named with the filename plus timestamp details and is written on the remote server, in the JVM directory or wherever the JVM’s heap-dump flags point
  • To take a thread dump
    • Click thread dump. The application collects the dump data returned over the connection and writes it to a file in the same directory as the toolkit; a sketch of both operations at the JMX level follows below
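
The sketch below shows what those two operations look like at the JMX level. This is a minimal illustration of the standard platform MBeans involved, not the toolkit’s actual source; the host and port come from the flags above, and the file names are invented:

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;
import com.sun.management.HotSpotDiagnosticMXBean;

public class DumpClient {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://volrd10.int.corp.sun:8802/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbsc = connector.getMBeanServerConnection();

            // Heap dump: the file is written on the REMOTE machine, relative to the
            // target JVM's working directory unless an absolute path is given
            HotSpotDiagnosticMXBean diag = ManagementFactory.newPlatformMXBeanProxy(
                    mbsc, "com.sun.management:type=HotSpotDiagnostic",
                    HotSpotDiagnosticMXBean.class);
            diag.dumpHeap("heapdump-" + System.currentTimeMillis() + ".hprof", true);

            // Thread dump: the data comes back over the wire, so it can be written
            // to a local file next to the client (ThreadInfo.toString() truncates
            // very deep stacks, but is fine for a sketch)
            ThreadMXBean threads = ManagementFactory.newPlatformMXBeanProxy(
                    mbsc, ManagementFactory.THREAD_MXBEAN_NAME, ThreadMXBean.class);
            StringBuilder dump = new StringBuilder();
            for (ThreadInfo info : threads.dumpAllThreads(true, true)) {
                dump.append(info);
            }
            java.nio.file.Files.write(
                    java.nio.file.Paths.get("threaddump-" + System.currentTimeMillis() + ".txt"),
                    dump.toString().getBytes());
        }
    }
}

This asymmetry is why the heap dump stays on the remote server while the thread dump can be saved next to the toolkit.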

Using the scheduler:
The scheduler can be used to take dumps at a set time, on a set repetition period, and for a set duration. This is useful for diagnosing issues caused by something building up over time.
To do this, check the scheduler checkbox and enter the start delay (time to wait before the first dump is taken), the repeat period (time to wait before taking another dump) and the duration (how long to keep taking dumps for).
E.g. start delay=3600, repeat every=60, duration of=1800
will wait one hour to take a dump, then repeat taking a dump every minute after this for half an hour.
Finally, click the dump button to start the scheduler.
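
The scheduler semantics map directly onto timed task execution. Below is a minimal sketch of the same behaviour using Java’s ScheduledExecutorService; this illustrates the behaviour described above rather than the toolkit’s code, and takeDump() is a hypothetical stand-in for a dump call like the one sketched earlier:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

public class DumpScheduler {
    public static void main(String[] args) {
        // Values from the example above, all in seconds
        long startDelay = 3600, repeatEvery = 60, durationOf = 1800;
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

        // First dump after startDelay, then one every repeatEvery seconds
        ScheduledFuture<?> task = scheduler.scheduleAtFixedRate(
                DumpScheduler::takeDump, startDelay, repeatEvery, TimeUnit.SECONDS);

        // Stop repeating once the duration has elapsed, measured from the first dump
        scheduler.schedule(() -> {
            task.cancel(false);
            scheduler.shutdown();
        }, startDelay + durationOf, TimeUnit.SECONDS);
    }

    private static void takeDump() {
        System.out.println("dump taken at " + java.time.Instant.now()); // stand-in for a real dump
    }
}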

Extracting the Heap Dump from the server

1. On the machine holding the heap dump, log in using PuTTY and run the command

sudo su - [superuser]

This allows you to change file permissions, which you will need to do in order to modify and extract files. The following command recursively grants full permissions to the owner and to the ‘wsdev’ group (which you should belong to after the previous step) on the directory where heap dumps and their settings live:

chmod -Rf 774 /apps/tc6

Setting the flags for remote JMX connections:

1. Using WinSCP, go to the directory /apps/tc6/[nameOfAppForHeapDump]/script

2. In the file that contains the flags (or the config), modify the flags to match those shown in the tutorial above. The one I have noticed always needs to be changed (from true to false) is

-Dcom.sun.management.jmxremote.authenticate=false

Also, if you want a heap dump to be taken automatically on out-of-memory errors, add the following flag to the file

-XX:+HeapDumpOnOutOfMemoryError

so the full line would be

CATALINA_OPTS="$CATALINA_OPTS -XX:+HeapDumpOnOutOfMemoryError"
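
If you also want to control where these dumps are written, HotSpot’s companion flag -XX:HeapDumpPath can be added as well (the directory here is only an example):

CATALINA_OPTS="$CATALINA_OPTS -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/apps/tc6/dumps"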

You may get an error along the lines of ‘timestamp cannot be modified’ after saving; just skip this message. Alternatively, go to WinSCP preferences and check the box for ‘Ignore permission errors’ to suppress this warning from now on.

You must now stop and start the JVM again to apply the changes.

Extracting heap dumps

Drag and drop the heap dump file to your local PC (or copy it with scp/pscp); this should work if you’ve set the permissions correctly.

Garbage Collection Analysis with HPJmeter

2 Sep

Introduction

HPJmeter is a tool for analysing garbage collection logs, and a useful aid in identifying JVM issues.

How to use
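
Note that HPJmeter needs a GC log to analyse. If the JVM is not already writing one, flags along these lines will produce a gc.log (a common set for HotSpot JVMs of this generation; the file name is an example):

-verbose:gc -Xloggc:gc.log -XX:+PrintGCDetails -XX:+PrintGCTimeStamps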

  1. Download the tool from here to your machine
  2. Execute the jar file to open the application. Retrieve the garbage collection log (gc.log) and save it somewhere it can be opened by HPJmeter
  3. Go to Open File and select the gc file. You should get something that looks like this:

[screenshot: hpjmeter1 – summary view]

The summary view shows the overall statistics to do with GC and the heap. The screenshot below is the other main view, ‘Heap Usage After GC’:

[screenshot: hpjmeter2 – Heap Usage After GC]

This shows heap usage after GC and the garbage collection details. Pink indicates the JVM going into full GC.

General Trends

Typical

The below is an example of a typical/expected graph: memory increases until heap usage drops again, and this repeats, forming a ‘zig-zag’ or ‘sawtooth’ type pattern.

[screenshot: hpjmeter3 – sawtooth heap usage]

Memory leak indicative

The below is an example indicative of memory issues: garbage collection goes into full GC and stays there, never able to release objects to decrease memory usage. When this happens there should be out-of-memory errors visible in the logs themselves.

[screenshot: hpjmeter4 – memory leak pattern]

Script for use on Linux machines to take thread dumps over time for a Java application

26 Aug
Usage

To use it, run:

./scriptName.sh {jstack|signal} <pid> <number of dumps> <interval (seconds)>

E.g.:

./scriptName.sh signal 23524 3 5

This will take 3 thread dumps 5 seconds apart (a total of 3 dumps in 15 seconds) for the process id 23524. To find out the ‘pid’ use top/ps or similar.

Use signal if you don’t have jstack installed.

The Script

#!/bin/bash
#
# Make a java application produce a thread dump every <interval (seconds)>
# seconds, <number of dumps> times. With 'signal' (kill -QUIT) the dump goes
# to the application's own stdout/log; with 'jstack' it is printed to this
# script's stdout.

usage() {
	echo "Usage: $(basename "$0") {jstack|signal} <pid> <number of dumps> <interval (seconds)>"
	exit 1
}

[ $# -eq 4 ] || usage

case "$1" in
	jstack)
		CMD=jstack
	;;
	signal)
		CMD="kill -QUIT"
	;;
	*)
		usage
	;;
esac

i=0
while [ $i -lt $3 ]
do
	$CMD $2
	sleep $4
	let i=$i+1
done

Sysstat script

26 Aug

What it is for

This is a script that retrieves resource utilization statistics for a specified time range from a Linux machine with sysstat installed. The script is shown at the bottom of this post.

How to use

You will first need to install sysstat on the machine you wish to gather performance statistics from. This can be retrieved here. You will also need openssh and gnuplot installed on the machine you run the script from.

Seven arguments supplied (plus the command name itself):

./get_sysstats.sh dirname dayofmonth sthr stmin endhr endmin threadcount
  • dirname – is created if it doesn’t already exist
  • dayofmonth – day of the month (dd) with leading zero if necessary
  • sthr – start hour of test period
  • stmin – start minute of test period
  • endhr – end hour of test period
  • endmin – end minute of test period
  • threadcount – Number of threads tested for this configuration if performance testing (used for presentation purposes only, use 1 by default)
  • The dayofmonth has values ranging from 01 to 31
  • The start and end hours have a value ranging from 00 to 23
  • The start and end minutes have a value ranging from 00 to 59

To specify a start time of 9:05 and an end time of 10:05 on the first day of this month, invoke it like:

./[scriptName].sh dirname 01 09 05 10 05 1
  • Change the ‘appServer’, ‘username’ and ‘sar_logfile’ variables to the values of the machine you wish to capture statistics from
  • After running the script you will be asked to enter the password for the set username 4 times (once per ssh call). If you want to avoid this, set up ssh and sshd for non-interactive login, e.g. with ssh-keygen and ssh-copy-id.

Example output

[screenshot: sysstatoutput.png]

The Script


#!/bin/bash

# Seven arguments supplied (plus the command name itself):
# ./get_sysstats.sh dirname dayofmonth sthr stmin endhr endmin threadcount
#
# dirname - is created if it doesn't already exist
# dayofmonth - day of the month (dd) with leading zero if necessary
# sthr - start hour of test period
# stmin - start minute of test period
# endhr - end hour of test period
# endmin - end minute of test period
# threadcount - Number of thread (pairs) tested for this configuration
#
# The dayofmonth has values ranging from 01 to 31
# The start and end hours have a value ranging from 00 to 23
# The start and end minutes have a value ranging from 00 to 59
#
# To specify a start time of 9:05 and an end time of 10:05 on the
# first day of this month you must say something like:
#
# ./get_sysstats.sh dirname 01 09 05 10 05 1
#
# Change 'appServer', 'username' and 'sar_logfile' variables to required values
#
#

E_BADARGS=65

if [ $# -ne 7 ]
then
  echo "Usage: `basename $0` dirname dayofmonth sthr stmin endhr endmin threadcount"
  echo ""
  echo "The time specified is a local time but the date is effectively UTC."
  echo "This means that depending on the time of day if you are collecting"
  echo "today's statistics you may need to specify the day of the month as"
  echo "today or yesterday based on the time of day."
  echo ""
  echo "To get an idea of the date you'll need to specify use date -u."
  exit $E_BADARGS
fi

destdir=$1
dayofmonth=$2
sthr=$3
stmin=$4
endhr=$5
endmin=$6

sth=$sthr
stm=$stmin
endh=$endhr
endm=$endmin
threadcount=$7

if [ `echo $sthr|cut -b 1` == 0 ]; then sth=`echo $sthr|cut -b 2`; fi
if [ `echo $endhr|cut -b 1` == 0 ]; then endh=`echo $endhr|cut -b 2`; fi
if [ `echo $stmin|cut -b 1` == 0 ]; then stm=`echo $stmin|cut -b 2`; fi
if [ `echo $endmin|cut -b 1` == 0 ]; then endm=`echo $endmin|cut -b 2`; fi

# gnuplot x-axis tick interval: one sixth of the test window, in seconds
xinterval=$(((($endh - $sth)*3600+($endm-$stm)*60)/6))
# The date of test is concatenated to the day of month to create the test date for which data is to be collected
dateoftest=`date +%m-%Y`
fullmachinename=`uname -n`
nodename=${fullmachinename%%.*} # strip any domain part
appServer=p5b6-ubu10 # Set your applications server here
username=admin # Set your username here

if [ ! -d $destdir ]
then
  echo "Creating directory: ${destdir}"
  mkdir -p $destdir
fi

# Centos
#sar_logfile=/var/log/sa/sa${dayofmonth}

# Ubuntu
sar_logfile=/var/log/sysstat/sa${dayofmonth} # Set your log file directory here

timestamp=${dayofmonth}-${dateoftest}_${sthr}${stmin}-${endhr}${endmin}
cpu_report=${destdir}/${appServer}_cpu_${timestamp}.csv
disk_report=${destdir}/${appServer}_dsk_${timestamp}.csv
memory_report=${destdir}/${appServer}_mem_${timestamp}.csv
network_report=${destdir}/${appServer}_net_${timestamp}.csv
echo "date of test: ${dayofmonth}-${dateoftest}"
echo "nodename: ${nodename}"
echo "Start time: ${sthr}:${stmin}"
echo "End time: ${endhr}:${endmin}"

# Get statistics application server

# sar stats for CPU
ssh ${username}@${appServer} sadf -D -s ${sthr}:${stmin}:00 -e ${endhr}:${endmin}:00 -- -u ${sar_logfile} > ${cpu_report}
# sar stats for disk I/O
ssh ${username}@${appServer} sadf -D -s ${sthr}:${stmin}:00 -e ${endhr}:${endmin}:00 -- -b ${sar_logfile} > ${disk_report}
# sar stats for memory
ssh ${username}@${appServer} sadf -D -s ${sthr}:${stmin}:00 -e ${endhr}:${endmin}:00 -- -r ${sar_logfile} > ${memory_report}
# sar stats for network
ssh ${username}@${appServer} sadf -D -s ${sthr}:${stmin}:00 -e ${endhr}:${endmin}:00 -- -n DEV ${sar_logfile} > ${network_report}

# Separate the ethernet and loopback interface results from the network statistics file
ethnet_report=${destdir}/${appServer}_ethnet_${dayofmonth}-${dateoftest}_${sthr}${stmin}-${endhr}${endmin}.csv
lonet_report=${destdir}/${appServer}_lonet_${dayofmonth}-${dateoftest}_${sthr}${stmin}-${endhr}${endmin}.csv
grep '[^p]eth0' ${network_report} > ${ethnet_report}
grep ';lo;' ${network_report} > ${lonet_report}

# Remove first line from cpu, mem and disk statistics files
sed --in-place=.orig -e '1d' ${cpu_report}
sed --in-place=.orig -e '1d' ${memory_report}
sed --in-place=.orig -e '1d' ${disk_report}

# Create a configuration file to be used to drive gnuplot using a here-document
(
cat <<EOF
# Gnuplot configuration for performance graphs
#
# Render Performance graphs
# Network
#
set datafile separator ";"
set terminal png size 1024,768
set out "${destdir}/${threadcount}tp-application-performance-${dayofmonth}-${dateoftest}-${sthr}${stmin}-${endhr}${endmin}.png"
set multiplot layout 2,2 title "Application Server ${appServer} (${dayofmonth}-${dateoftest} ${sthr}:${stmin} - ${endhr}:${endmin})"
set title "Network Tx/Rx Bytes per Second (${threadcount} Threads)"
set grid mxtics
set grid xtics
set grid mytics
set grid ytics
set xdata time
set lmargin 8
set rmargin 3
set timefmt "%s"
set format x "%l:%M"
set xtics $xinterval
plot "${destdir}/${appServer}_ethnet_${dayofmonth}-${dateoftest}_${sthr}${stmin}-${endhr}${endmin}.csv" using 3:8 title "eth0 Tx bps" with lines, \
     "${destdir}/${appServer}_ethnet_${dayofmonth}-${dateoftest}_${sthr}${stmin}-${endhr}${endmin}.csv" using 3:7 title "eth0 Rx bps" with lines, \
     "${destdir}/${appServer}_lonet_${dayofmonth}-${dateoftest}_${sthr}${stmin}-${endhr}${endmin}.csv" using 3:8 title "lo Tx bps" with lines, \
     "${destdir}/${appServer}_lonet_${dayofmonth}-${dateoftest}_${sthr}${stmin}-${endhr}${endmin}.csv" using 3:7 title "lo Rx bps" with lines

# CPU
set title "CPU Loading (${threadcount} Threads)"
plot "${cpu_report}" using 3:5 title "% User" with lines, \
     "${cpu_report}" using 3:7 title "% System" with lines

# Disk I/O
set title "I/O Loading (${threadcount} Threads)"
plot "${disk_report}" using 3:4 title "TPS" with lines, \
     "${disk_report}" using 3:5 title "Read TPS" with lines, \
     "${disk_report}" using 3:6 title "Write TPS" with lines, \
     "${disk_report}" using 3:7 title "Block reads/s" with lines, \
     "${disk_report}" using 3:8 title "Block writes/s" with lines

# Memory
set title "Memory Usage (${threadcount} Threads)"
plot "${memory_report}" using 3:4 title "kb mem free" with lines, \
     "${memory_report}" using 3:5 title "kb mem used" with lines, \
     "${memory_report}" using 3:7 title "kb buffers" with lines, \
     "${memory_report}" using 3:8 title "kb cached" with lines, \
     "${memory_report}" using 3:9 title "kb committed" with lines
unset multiplot


EOF
) > performancegraphs.gp
# End of here-document

gnuplot performancegraphs.gp

rm performancegraphs.gp

SQL script for dropping all user created objects (MSSQL, Oracle)

26 Aug

Below is an SQL script that removes a range of user-created objects in the current database (tables, views, procedures, etc.). It has been tested and works in MSSQL. It may work in Oracle as well, but this is untested (the sys.* catalog views and sp_executesql it relies on are SQL Server features, so expect changes to be needed). Use at your own risk. Each section includes a commented-out ‘select @stmt’ you can uncomment to preview the generated statements before executing them.


declare @n char(1)
set @n = char(10)

declare @stmt nvarchar(max)
set @stmt = '' -- ensure sp_executesql gets an empty batch rather than NULL if a section finds no objects

-- procedures
select @stmt = isnull( @stmt + @n, '' ) +
'drop procedure [' + object_schema_name( object_id ) + '].[' + name + ']'
from sys.procedures
--select @stmt
exec sp_executesql @stmt

-- check constraints
select @stmt = ''
select @stmt = isnull( @stmt + @n, '' ) +
'alter table [' + object_schema_name( parent_object_id ) + '].[' + object_name( parent_object_id ) + '] drop constraint [' + name + ']'
from sys.check_constraints
--select @stmt
exec sp_executesql @stmt

-- functions
select @stmt = ''
select @stmt = isnull( @stmt + @n, '' ) +
'drop function [' + object_schema_name( object_id ) + '].[' + name + ']'
from sys.objects
where type in ( 'FN', 'IF', 'TF' ) and object_schema_name( object_id ) <> 'ExtProps'
--select @stmt
exec sp_executesql @stmt

-- views
select @stmt = ''
select @stmt = isnull( @stmt + @n, '' ) +
'drop view [' + object_schema_name( object_id ) + '].[' + name + ']'
from sys.views where object_schema_name( object_id ) <> 'ExtProps'
--select @stmt
exec sp_executesql @stmt

-- foreign keys
select @stmt = ''
select @stmt = isnull( @stmt + @n, '' ) +
'alter table [' + object_schema_name( parent_object_id ) + '].[' + object_name( parent_object_id ) + '] drop constraint [' + name + ']'
from sys.foreign_keys
--select @stmt
exec sp_executesql @stmt

-- tables
select @stmt = ''
select @stmt = isnull( @stmt + @n, '' ) +
'drop table [' + object_schema_name( object_id ) + '].[' + name + ']'
from sys.tables
--select @stmt
exec sp_executesql @stmt

-- user defined types
select @stmt = ''
select @stmt = isnull( @stmt + @n, '' ) +
'drop type [' + name + ']'
from sys.types
where is_user_defined = 1
--select @stmt
exec sp_executesql @stmt

-- Schemas
select @stmt = ''
select @stmt = isnull( @stmt + @n, '' ) +
'drop schema [' + name + ']'
from sys.schemas
where schema_id between 5 and 1000
--select @stmt
exec sp_executesql @stmt

Eclipse Memory Analyzer

24 Jul


Introduction

Eclipse Memory Analyzer is used to analyse heap dumps taken from the target JVM. It is a useful tool for identifying potential memory leaks and memory-intensive areas of an application.

How to use

  1. Extract the folder contained in the downloaded zip file
  2. Open the file mat/MemoryAnalyzer.ini in a text editor and change the -Xmx value under -vmargs to suit how large a heap dump you intend to open. E.g. if the target JVM’s max heap size is 2048mb, change the value to -Xmx2048m, though you may have to allow some leeway (e.g. -Xmx2500m) as dumps can go over the max limit when an out of memory error occurs
  3. Start the application (mat/MemoryAnalyzer.exe)
  4. Go to File > Open Heap Dump. Eclipse Memory Analyzer recognizes the extension .hprof as a heap dump, so if your dump does not have this extension, select All Files in the dropdown
  5. Open up the heap dump. A dialog shows the progress of parsing the dump; this may take a while depending on how large your heap dump is
  6. After it has been opened you will see a dialog offering a choice of reports
  7. Choosing the option Leak Suspects Report will give you an overview of which objects occupy a large portion of the heap, so this is a good place to start. In the example here we get a high-level picture of potential memory leaks or inefficient memory usage due to instances of “org.apache.tomcat.util.threads.TaskThread”, which occupy ~45% of the heap

Another useful way to drill down further towards the root cause of the problem is to click the ‘OQL’ toolbar icon.

This opens a new tab in which you can run queries against the problem classes. A query that selects all objects of the specified class displays them in a tree-like structure. Clicking the ‘Retained Heap’ column heading sorts by descending memory usage, which is probably the most useful order to view these objects in. You can then expand the nodes to gain further insight into where most of the memory has been allocated. The panel on the left-hand side is useful as it shows the values of each object’s fields.
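
For the TaskThread example above, the query would be along these lines (standard Memory Analyzer OQL; the class name is the one flagged by the Leak Suspects Report):

SELECT * FROM org.apache.tomcat.util.threads.TaskThread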

Google Code Jam Qualification Round Africa and Arabia 2011, Problem A: Closing the Loop

23 Jul

Problem

Given a bag full of rope segments, you will build the longest possible loop of rope while alternating colors. The bag contains S segments and each segment is either blue (B) or red (R). You are required to alternate between colors, and because of this requirement you might not use every segment in the bag. If you only have segments of a single color, you will not be able to tie any knots and should output 0. Each segment length is provided in centimeters and each knot in the loop consumes one centimeter of length from the loop. In other words, a knot consumes one-half of a centimeter from each of the two segments it connects.

Note that pieces of string that have length 1, if used in making the cycle, might get reduced to just a pair of knots of total length 0. This is allowed, and each such piece counts as having been used.
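
As a worked example of the knot arithmetic, take sample case 2 below (6R 1B 7R 3B): all four segments can be used in an alternating loop, giving a total length of 6 + 1 + 7 + 3 = 17 centimeters, minus 4 knots at one centimeter each, for a final loop length of 13.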

Input

The first line of input gives the number of cases, N.
N test cases follow. For each test case there will be:

  • One line containing the value S, the number of rope segments in the bag.
  • One line containing a space separated list of S values. Each value L indicates the segment length in centimeters followed by the letter B or R to indicate the segment color.

Output

For each test case, output one line containing “Case #x: ” followed by the maximum length of the rope loop that can be generated with the rope segments provided.

Limits

1 ≤ number of rope segments (S) ≤ 1000
1 ≤ length of a rope segment (L) ≤ 100

Small dataset

N ≤ 5

Large dataset

N ≤ 50

Sample

Input

4
1
5B
4
6R 1B 7R 3B
7
5B 4R 3R 2R 5R 4R 3R
2
20B 20R

Output

Case #1: 0
Case #2: 13
Case #3: 8
Case #4: 38

Solution

The approach is greedy: sort each color’s segments in descending order and take the k longest of each color, where k is the smaller of the two color counts. The loop must alternate colors, so it uses equal numbers of each, and every segment used contributes one knot costing one centimeter. Taking as many pairs as possible never hurts, because an extra pair adds two segments of length at least 1 and loses at most 2 centimeters to knots.

from sys import stdin

f = open('output.out', 'w')
for i in range(0, int(stdin.readline())):
    stdin.readline()  # skip S, the segment count; we just split the next line
    segments = stdin.readline().split()

    blue = []
    red = []

    # Split segments into the two colors, dropping the trailing color letter
    for j in segments:
        if j[len(j)-1] == 'B':
            blue.append(int(j[0:len(j)-1]))
        else:
            red.append(int(j[0:len(j)-1]))

    # A loop must alternate colors, so one color alone means no loop
    if len(blue) == 0 or len(red) == 0:
        print 'Case #'+str(i+1)+': 0'
        f.write('Case #'+str(i+1)+': 0\n')
        continue

    blue = sorted(blue, reverse=True)
    red = sorted(red, reverse=True)

    sum = 0
    segUsed = 0

    # Use as many segments as the scarcer color allows
    length = min(len(blue), len(red))

    # Take the longest segments of each color; each segment used adds one knot
    for k in range(0, length):
        sum += blue[k]
        sum += red[k]
        segUsed += 2

    print 'Case #'+str(i+1)+': ' + str(sum-segUsed)
    f.write('Case #'+str(i+1)+': ' + str(sum-segUsed)+'\n')

f.close()
