Sunday, December 21, 2014

Shell Scripting : Why does closing a terminal kill all the programs running in it?

When a terminal is closed, the shell running in it is sent the SIGHUP (hangup) signal. Before the shell exits, it resends SIGHUP to all of its jobs, whether running or stopped. Stopped jobs are first sent SIGCONT to ensure that they can receive the SIGHUP.

To prevent the shell from sending SIGHUP to a particular job, remove the job from the jobs table using the disown builtin command.
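
For example, with a long-running background job (sleep here is just a stand-in for any program):

$ sleep 1000 &         # start a background job
$ disown %1            # job 1 leaves the jobs table; closing the terminal will not HUP it

An alternative is to start the program immune to hangups in the first place:

$ nohup sleep 1000 &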

Shell Scripting : Why can't a directory change be made in a separate process?

While executing the cd command, if a new process were forked as a child of the current shell, then after the cd command completed, control would revert to the parent process (the current shell) and the original directory would be restored. Hence, it would be impossible to change directories.

That is why the cd command is executed within the current shell process itself.
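
A quick way to see this: run cd in an explicit subshell (the parentheses fork a child process) and note that the parent shell's directory is unchanged. The paths shown are just illustrative.

$ pwd
/home/bob
$ (cd /tmp; pwd)       # cd happens in the subshell
/tmp
$ pwd                  # the parent shell never moved
/home/bob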

Shell Scripting : Shell Builtin Commands

Shell builtin commands do not fork (spawn) a new process; they are executed in the current shell itself.

Good examples are commands like cd, echo, and variable assignments like a=2.
Since these commands do not generate a new process when executed, they run much faster, as there is no overhead of spawning a new process.
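
To check whether a given command is a builtin, use the type builtin (the ls result assumes no alias is defined for it):

$ type cd
cd is a shell builtin
$ type echo
echo is a shell builtin
$ type ls
ls is /bin/ls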

Shell Scripting : Length of a string

In shell scripting, the length of a string can be found with the ${#string} expansion:

#!/bin/bash
line="to do"

echo "Length of string is ${#line}"   # prints: Length of string is 5

Shell Scripting : Integer comparison

Integer comparison operators, in both their test-bracket and double-parentheses forms:

Operator   Meaning                        Usage
-eq        is equal to                    if [ "$a" -eq "$b" ]
-ne        is not equal to                if [ "$a" -ne "$b" ]
-gt        is greater than                if [ "$a" -gt "$b" ]
-ge        is greater than or equal to    if [ "$a" -ge "$b" ]
-lt        is less than                   if [ "$a" -lt "$b" ]
-le        is less than or equal to       if [ "$a" -le "$b" ]

Within double parentheses:

<          is less than                   (("$a" < "$b"))
<=         is less than or equal to       (("$a" <= "$b"))
>          is greater than                (("$a" > "$b"))
>=         is greater than or equal to    (("$a" >= "$b"))

Shell Scripting : AND, OR operators

To use the AND and OR operators in shell scripting, there are two ways:

1) -a for AND
    -o for OR

the usage is  [ exp1 -a exp2 ] , [ exp1 -o exp2 ]
Single square bracket for using -a, -o

2) && for AND
    ||      for OR

the usage is [[ exp1 && exp2 ]] , [[ exp1 || exp2 ]]
Double square bracket for using &&, ||


Eg1:

#!/bin/bash

a=4
b=5

if [ "$a" -ne "$b" -a "$a" -lt "$b" ]
then
   echo "$a is not equal to $b"

fi

Eg2:

#!/bin/bash

a=4
b=5

if [[ "$a" -ne "$b" && "$a" -lt "$b" ]]
then
   echo "$a is not equal to $b"
fi

Thursday, December 18, 2014

Shell Scripting : How to find the value of the last argument passed to a shell script?

Curly-bracket indirection, ${!...}, applied to the argument count gives the value of the last argument passed to a shell script.

#!/bin/bash
#argcheck.sh

args=$#          #Specifies the number of arguments passed to the script

lastarg=${!args} #Specifies the value of the last argument passed to the script

# lastarg=${!#} is another way to get the last argument value

echo $args

echo $lastarg

If the above script is run as

./argcheck.sh 10 20 30

the output is

3  - the number of arguments passed to the script
30 - the value of the last argument

Shell Scripting : How to specify positional parameters greater than 9?

In shell scripting, the positional parameters are specified by $0, $1, $2, $3, $4, $5, $6, $7, $8, $9

To specify positional parameters greater than 9, it can be done by enclosing them in curly brackets as follows: ${10}, ${11} and so on.

$0 is the name of the script itself, $1 is the first argument, $2 is the second argument and so on
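
A small demonstration of why the braces matter; the script name and arguments here are made up for illustration:

#!/bin/bash
# run as: ./tenargs.sh a b c d e f g h i j
echo "Tenth argument: ${10}"    # prints: j
echo "Without braces: $10"      # parsed as ${1} followed by a literal 0, prints: a0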

Shell Scripting : How to store the output of a command in a variable in a shell script?

To store the output of a command in a variable in a shell script (also called command substitution), there are two methods:

1) Using backquotes ``
2) Using $( )

Let us see how

# arch=`uname -m`
# echo $arch
x86_64

# arch=$(uname -m)
# echo $arch
x86_64
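
One reason to prefer $( ) over backquotes: it nests cleanly without any escaping. For example, to get the name (not the full path) of the current directory:

# dir=$(basename $(pwd))
# echo $dir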

Wednesday, December 17, 2014

Shell Scripting : > &> >&

scriptname >filename redirects the output of scriptname to the file filename, overwriting filename if it already exists.

command &>filename redirects both the stdout and the stderr of command to filename.

command >&2 redirects stdout of command to stderr.
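
For example (the script and file names are placeholders):

./script.sh > out.log          # stdout to out.log, overwriting it
./script.sh &> all.log         # both stdout and stderr to all.log
echo "error: bad input" >&2    # send a message to stderr instead of stdout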

Tuesday, December 16, 2014

Shell Scripting : How to truncate a file to zero length, without changing permissions?

To truncate a file to zero length without changing permissions, here are the options

1) cp /dev/null file.xxx
2) : > file.xxx

In the second option involving the colon (:), since : is a shell builtin command, no new process is forked.
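
To see that ownership and permissions survive, compare the listing before and after (the file name, size and dates are illustrative):

$ ls -l file.xxx
-rw-r--r-- 1 bob bob 1048576 Dec 16 10:00 file.xxx
$ : > file.xxx
$ ls -l file.xxx
-rw-r--r-- 1 bob bob 0 Dec 16 10:01 file.xxx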

Monday, December 15, 2014

Shell Scripting : What if the shebang (#!) is followed by the path of the command /bin/rm, i.e. #!/bin/rm

What will happen if the following script, test.sh, is executed?

#!/bin/rm

echo "Hi"

When executed, ./test.sh will remove the file test.sh itself: the kernel runs the interpreter named after the shebang, /bin/rm, with the script's path as its argument, so rm deletes the script and the echo line is never reached.

Shell Scripting : What if there are two shebang (#!) lines in the shell script?

Suppose there are two shebang (#!) lines in a shell script:

#!/bin/bash

echo "First line"

#!/bin/csh

echo "Second line"

When we run the above script, will /bin/bash be the interpreter for executing the command echo "First Line"  and later  /bin/csh be the interpreter for executing the command echo "Second line"?

No. Only the first shebang line counts: /bin/bash is the interpreter for the whole script. The second shebang line, beginning with #, is treated as a comment, as is any line beginning with # after the first shebang line.


Shell Scripting : Which gets executed - shell specified in shebang(#!) or shell specified in command line

Suppose I have an executable script, test.sh as follows

#!/bin/bash

readlink /proc/$$/exe

When I run the above script as ./test.sh, the bash shell executes the script, and the output is

$ ./test.sh
/bin/bash

If I run the above script as follows

$ /bin/csh test.sh

Which shell will execute the script - csh(specified in the command line) or bash(specified in the shebang)?

The output of the above command is /bin/csh.

So why does /bin/csh execute the script instead of /bin/bash specified in the script?

Here is the answer,

Whenever a shell script is executed, a new shell (process) is spawned (forked) and the commands in the script are executed in it. This child shell becomes the parent process of the commands in the script.

The shebang, #!, is used by the exec() family of functions to determine whether the file to be executed is a binary or a script. The shebang is actually a two-byte magic number that designates a file type. If the shebang is present, exec() will run the executable specified after it - in the above case, /bin/bash. Immediately following the shebang (#!) is the path name: the path to the program that interprets the commands in the script, whether it be a shell, a programming language, or a utility. This command interpreter then executes the commands in the script, starting at the top (the line following the shebang line) and ignoring comments.

Now if you run the script by specifying the shell on the command line, as in /bin/csh test.sh, exec() runs the interpreter given on the command line, /bin/csh, and never consults the shebang line inside the script; to csh, that line is just a comment.


Thursday, December 11, 2014

SIGNALS: Use Signal Names or Signal Numbers with KILL command?

While passing signals to the kill command, it is always better to use signal names rather than numbers, because some signal numbers differ between SVR4 and BSD systems while the names stay the same.
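
For example (1234 stands for the target PID):

kill -s SIGTERM 1234    # POSIX form, by name
kill -TERM 1234         # common shorthand, also by name
kill -15 1234           # numeric; works for SIGTERM, but numbers for other signals (e.g. SIGUSR1) vary across systems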

Wednesday, December 10, 2014

How to copy a specific file type keeping the folder structure intact in the destination directory

By using the cpio command in pass-through mode (-p), files can be copied to the destination directory with the same path structure as in the source directory. The -d flag creates leading directories as needed, and -m preserves the files' modification times.

find /source_path/of/files -name "<filename_pattern>" | cpio -pdm /destination_directory_path

Eg: To copy all files of type "*.sh" in /home/bob to /tmp/jim, keeping the source directory path of files in destination intact


find /home/bob -type f -name "*.sh" | cpio -pdm /tmp/jim

How to find the version of Cloudera Hadoop

Read the following file

# cat /usr/lib/hadoop/cloudera/cdh_version.properties

# Autogenerated build properties
version=2.0.0-cdh4.1.2
git.hash=f0b53c81cbf56f5955e403b49fcd27afd5f082de
cloudera.hash=f0b53c81cbf56f5955e403b49fcd27afd5f082de
cloudera.base-branch=cdh4-base-2.0.0
cloudera.build-branch=cdh4-2.0.0_4.1.2
cloudera.pkg.version=2.0.0+552
cloudera.pkg.release=1.cdh4.1.2.p0.27
cloudera.cdh.release=cdh4.1.2
cloudera.build.time=2012.11.02-00:01:31GMT

cloudera.pkg.name=hadoop

The Hadoop version installed is 2.0.0-cdh4.1.2, which means that Cloudera version installed is based on Apache Hadoop 2.0.0 release.
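
Alternatively, assuming the hadoop CLI is on the PATH, its version subcommand carries the same version string on the first line of its output:

# hadoop version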

Friday, August 1, 2014

Hadoop Namenode - Image, Checkpoint(fsimage), Journal(edits)

Image

HDFS Namenode keeps the entire namespace in RAM. The inode data and list of blocks belonging to each file comprise the metadata of the name system called the image.

Checkpoint (fsimage)

The persistent record of the image, stored in the Namenode's native filesystem, is called a checkpoint. The locations of block replicas may change over time and are not part of the persistent checkpoint. The checkpoint is also called the fsimage (filesystem image).

The fsimage file contains a serialized form of all the directory and file inodes in the filesystem. Each inode is an internal representation of a file or directory’s metadata and contains such information as the file’s
replication level, modification and access times, access permissions, block size, and the blocks a file is made up of. For directories, the modification time, permissions, and quota metadata is stored.

Note: The fsimage file does not record the datanodes on which the blocks are stored. Instead the namenode keeps this mapping in memory, which it constructs by asking the datanodes for their block lists when they join the cluster and periodically afterward to ensure the namenode’s block mapping is up-to-date.

Journal (edit log)

The NameNode also stores the modification log of the image, called the journal, in the local host's native filesystem. The journal is also called the edit log (edits). When a filesystem client performs a write operation (such as creating or moving a file), it is first recorded in the edit log. The namenode also has an in-memory representation of the filesystem metadata, which it updates after the edit log has been modified. The in-memory metadata is used to serve read requests.

Upon namenode startup, the fsimage file is loaded into RAM and any changes in the edits file are replayed, bringing the in-memory view of the filesystem up to date.

How HDFS is different from traditional filesystems like ext3?


  1. Traditional filesystems like ext3 are implemented as kernel modules. HDFS is a userspace filesystem - the filesystem code runs outside the kernel as an OS process and is not registered with or exposed via the Linux VFS layer.
  2. Traditional filesystems need to be mounted. HDFS filesystems need not be mounted, as HDFS just runs as an OS process.
  3. HDFS is a distributed filesystem - distributed across many machines. So the size of an HDFS file is not limited by the capacity of any one machine, whereas in traditional filesystems a file cannot exceed the disk space of the machine.
  4. Traditional filesystems use a block size of 4KB or 8KB. HDFS uses a much larger block size, 64MB by default.
  5. Unlike conventional filesystems, HDFS provides an API that exposes the locations of a file's blocks. This allows applications like the MapReduce framework to schedule a task where the data are located, thus improving the read performance.

Friday, July 25, 2014

Tools for benchmarking Hard Disk performance(File system Read Write Performance)

dd

The dd command can be used to read from and write to any block device in Linux.
Using dd we can perform simple sequential read and sequential write tests.

Using dd command to check sequential write speed by writing a 1GB file to disk

dd if=/dev/zero of=/tmp/outputfile bs=1M count=1000

But the above test has a drawback. Any data written to disk is first cached in memory and then written to the disk. So the above test reports the speed at which data was cached into RAM, not the speed at which it was written to the disk. Hence the command is tweaked as follows:

dd if=/dev/zero of=/tmp/outputfile bs=1M count=1000 conv=fdatasync

Now dd will report the write speed only after the data is synced to the disk.

Alternative way,

  sync;time bash -c "(dd if=/dev/zero of=/tmp/outputfile bs=1M count=1000; sync)"

Using dd command to check sequential read speed

   dd if=/tmp/outputfile of=/dev/null bs=1M count=1000
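
One caveat: the file just written is probably still in the page cache, so the read test may only measure RAM speed. Dropping the caches first (as root) makes the read actually hit the disk:

sync; echo 3 > /proc/sys/vm/drop_caches
dd if=/tmp/outputfile of=/dev/null bs=1M count=1000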

hdparm 

It is a performance and benchmarking tool for SATA/IDE drives.

To measure how many MB/s the hard drive can read
     hdparm -t --direct /dev/sda1
     (or) hdparm -tT /dev/sda1

  -t : perform timings of device reads (actual disk read speed)
  -T : perform timings of cache reads (memory/cache throughput, no disk access)

iozone

iozone is an open source file system benchmarking utility.

The default command line option is -a, which stands for full automatic mode. It tests with record sizes ranging from 4k to 16M and file sizes ranging from 64k to 512M. iozone works through the filesystem, so run it from a directory on the filesystem under test:

     cd /filesystem/under/test && iozone -a

iozone performs 13 types of tests. Some important tests are

Write : Indicates the speed of writing a new file to the filesystem. Creating a new file is always slower than rewriting an existing file owing to the metadata overhead involved while creating a new file (like creating a new inode entry).

ReWrite : Indicates the speed of writing to an existing file

Read : Indicates the speed of reading an existing file

ReRead : Indicates the speed of rereading a file that is already read.

RandomRead : Indicates the speed at which random areas of a single file are read.

RandomWrite : Indicates the speed at which we can write into random areas of a file





Tools for Benchmarking Network Performance


  • iperf : To measure bandwidth - throughput between two nodes
  • sockperf ping-pong test : To measure network latency on TCP - https://code.google.com/p/sockperf/
  • ping : To measure network latency using ICMP
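
A minimal iperf run between two nodes (the server host name is a placeholder):

# on the server node:
iperf -s

# on the client node, a 10-second TCP throughput test against the server:
iperf -c server.example.com -t 10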

Wednesday, July 23, 2014

Difference between Incident and Problem

An incident as defined by ITIL is as follows: "An unplanned interruption to an IT Service or a reduction in the Quality of an IT Service. Failure of a Configuration Item that has not yet impacted Service is also an Incident. For example Failure of one disk from a mirror set." A configuration item is just about anything in ITIL Terms.  Everything from a NIC or Hard Drive to a Web Service can be declared a Configuration Item. It's just something to tie incidents and problems to.
A problem as defined by ITIL is as follows: "A cause of one or more Incidents. The cause is not usually known at the time a Problem Record is created, and the Problem Management Process is responsible for further investigation."
Here are some other quick-hit differences:

  • Problems should be traced down to a root-cause level, whereas incidents should be resolved as quickly as possible to restore the service to operation.
  • Incidents, like a known issue with a Java application memory leak, become a Problem when they repeatedly occur.
  • Incidents can generally be resolved by the person doing the work.
  • Problems must be closed by a manager assigned in the problem management process.

Tuesday, July 22, 2014

How to find number of CPUs in Linux?

Nowadays, each CPU is made up of multiple CORES and each CORE is treated as a separate CPU.

Also, if HYPERTHREADING is enabled, then each CORE is treated as TWO SEPARATE LOGICAL UNITS and these two LOGICAL UNITS too are treated as separate CPUs.

dmidecode -t processor | egrep "Socket Designation|Core Count|Thread Count"
        Socket Designation: CPU1
        Core Count: 4
        Thread Count: 8
        Socket Designation: CPU2
        Core Count: 4
        Thread Count: 8

As per the dmidecode output, there are
Two CPUs (sockets)
Eight cores - each CPU has 4 cores
Sixteen threads - each core has two threads

Hence, the number of logical CPUs in the system is 16

This can be verified by other means too

grep 'processor' /proc/cpuinfo
processor       : 0
processor       : 1
processor       : 2
processor       : 3
processor       : 4
processor       : 5
processor       : 6
processor       : 7
processor       : 8
processor       : 9
processor       : 10
processor       : 11
processor       : 12
processor       : 13
processor       : 14
processor       : 15



# Count the number of physical processors (sockets)
grep "physical id" /proc/cpuinfo | sort -u | wc -l

# Count the number of physical cores per CPU
grep "cpu cores" /proc/cpuinfo | sort -u | cut -d":" -f2

# Count the number of logical cores (including hyper-threading cores)
grep -c "processor" /proc/cpuinfo





LoadBalancer or RoundRobin DNS

The advantages of using a LoadBalancer over RoundRobin DNS are:

  • A LoadBalancer takes account of the load on the nodes behind it, directing traffic/requests to the node with the least load.
  • A LoadBalancer can take care of sessions/connections. For applications like a shopping cart, which make use of sessions, losing the node we are being served from means losing the session too; a LoadBalancer with session persistence avoids this, while RoundRobin DNS cannot.
  • If one of the nodes behind the LoadBalancer goes down, the LoadBalancer immediately detects it and stops sending requests to that node. This is not possible with RoundRobin DNS.

Saturday, May 17, 2014

json output in Shell script

Parsing JSON output in Shell script

jq is a useful utility to parse JSON output in a shell script

jq can be downloaded as a binary and scp'ed to any host and it works just fine.

For more info on jq refer http://stedolan.github.io/jq/

Let us put that to work. The following curl command throws up a JSON output

# curl -s http://localhost:12345/metrics
{"SOURCE.scribeSrc":{"OpenConnectionCount":"0","AppendBatchAcceptedCount":"0","AppendBatchReceivedCount":"0","Type":"SOURCE","EventAcceptedCount":"1850929558","AppendReceivedCount":"0","StopTime":"0","EventReceivedCount":"1850929558","StartTime":"1399527923731","AppendAcceptedCount":"0"},"SINK.avroSink3":{"BatchCompleteCount":"1786","ConnectionFailedCount":"8","EventDrainAttemptCount":"463854128","ConnectionCreatedCount":"1135","BatchEmptyCount":"251379","Type":"SINK","ConnectionClosedCount":"1134","EventDrainSuccessCount":"463852531","StopTime":"0","StartTime":"1399527920721","BatchUnderflowCount":"1222487"},"SINK.avroSink2":{"BatchCompleteCount":"1826","ConnectionFailedCount":"8","EventDrainAttemptCount":"464081974","ConnectionCreatedCount":"882","BatchEmptyCount":"251234","Type":"SINK","ConnectionClosedCount":"881","EventDrainSuccessCount":"464080652","StopTime":"0","StartTime":"1399527920721","BatchUnderflowCount":"1226221"},"SINK.avroSink1":{"BatchCompleteCount":"1830","ConnectionFailedCount":"8","EventDrainAttemptCount":"462363915","ConnectionCreatedCount":"882","BatchEmptyCount":"250488","Type":"SINK","ConnectionClosedCount":"881","EventDrainSuccessCount":"462362687","StopTime":"0","StartTime":"1399527920720","BatchUnderflowCount":"1219047"},"SINK.avroSink":{"BatchCompleteCount":"1881","ConnectionFailedCount":"8","EventDrainAttemptCount":"460631266","ConnectionCreatedCount":"882","BatchEmptyCount":"250796","Type":"SINK","ConnectionClosedCount":"881","EventDrainSuccessCount":"460630534","StopTime":"0","StartTime":"1399527920720","BatchUnderflowCount":"1219204"},"CHANNEL.fileChannel":{"EventPutSuccessCount":"1850929558","ChannelFillPercentage":"0.003154","Type":"CHANNEL","StopTime":"0","EventPutAttemptCount":"1850929558","ChannelSize":"3154","StartTime":"1399527920713","EventTakeSuccessCount":"1850926404","ChannelCapacity":"100000000","EventTakeAttemptCount":"1856822139"}}[12:46:23]

To make it readable, let us use jq command

# curl -s http://localhost:12345/metrics | /tmp/jq '.'
{
  "CHANNEL.fileChannel": {
    "EventTakeAttemptCount": "1857737035",
    "ChannelCapacity": "100000000",
    "EventPutSuccessCount": "1851841723",
    "ChannelFillPercentage": "0.001698",
    "Type": "CHANNEL",
    "StopTime": "0",
    "EventPutAttemptCount": "1851841723",
    "ChannelSize": "1698",
    "StartTime": "1399527920713",
    "EventTakeSuccessCount": "1851840025"
  },
  "SINK.avroSink": {
    "BatchUnderflowCount": "1219720",
    "StartTime": "1399527920720",
    "StopTime": "0",
    "BatchCompleteCount": "1881",
    "ConnectionFailedCount": "8",
    "EventDrainAttemptCount": "460864030",
    "ConnectionCreatedCount": "882",
    "BatchEmptyCount": "250852",
    "Type": "SINK",
    "ConnectionClosedCount": "881",
    "EventDrainSuccessCount": "460863849"
  },
  "SINK.avroSink1": {
    "BatchUnderflowCount": "1219539",
    "StartTime": "1399527920720",
    "StopTime": "0",
    "BatchCompleteCount": "1830",
    "ConnectionFailedCount": "8",
    "EventDrainAttemptCount": "462602212",
    "ConnectionCreatedCount": "882",
    "BatchEmptyCount": "250540",
    "Type": "SINK",
    "ConnectionClosedCount": "881",
    "EventDrainSuccessCount": "462601149"
  },
  "SINK.avroSink2": {
    "BatchUnderflowCount": "1226686",
    "StartTime": "1399527920721",
    "StopTime": "0",
    "BatchCompleteCount": "1826",
    "ConnectionFailedCount": "8",
    "EventDrainAttemptCount": "464310084",
    "ConnectionCreatedCount": "882",
    "BatchEmptyCount": "251284",
    "Type": "SINK",
    "ConnectionClosedCount": "881",
    "EventDrainSuccessCount": "464309471"
  },
  "SINK.avroSink3": {
    "BatchUnderflowCount": "1222934",
    "StartTime": "1399527920721",
    "StopTime": "0",
    "BatchCompleteCount": "1786",
    "ConnectionFailedCount": "8",
    "EventDrainAttemptCount": "464067716",
    "ConnectionCreatedCount": "1135",
    "BatchEmptyCount": "251438",
    "Type": "SINK",
    "ConnectionClosedCount": "1134",
    "EventDrainSuccessCount": "464065556"
  },
  "SOURCE.scribeSrc": {
    "AppendAcceptedCount": "0",
    "StartTime": "1399527923731",
    "OpenConnectionCount": "0",
    "AppendBatchAcceptedCount": "0",
    "AppendBatchReceivedCount": "0",
    "Type": "SOURCE",
    "EventAcceptedCount": "1851841723",
    "AppendReceivedCount": "0",
    "StopTime": "0",
    "EventReceivedCount": "1851841723"
  }
}
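
Beyond pretty-printing, jq can extract individual fields. For example, pulling the channel fill percentage out of the output above (the key names are taken from that output):

# curl -s http://localhost:12345/metrics | /tmp/jq '."CHANNEL.fileChannel".ChannelFillPercentage'
"0.001698"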



Monday, February 3, 2014

Puppet - How to set up - A simple Puppet Master and Puppet Client setup

Let me illustrate a simple setup of a Puppet master and a Puppet client, where the file /tmp/puppet-test-file will be automatically created on the client once it is defined on the Puppet master.

Puppet Master - 192.168.1.33
Puppet Client - 192.168.2.101

Setting up Puppet Master


Install puppet, puppet-server, facter 
      yum install puppet puppet-server facter

Puppet’s principal configuration file is puppet.conf
   In OpenSource Puppet, puppet.conf file is generally in the path
    /etc/puppet/puppet.conf

  In Puppet Enterprise:
    /etc/puppetlabs/puppet/puppet.conf

When running Puppet Master as normal user, puppet.conf file can be placed in
   /home/user/.puppet/puppet.conf

Here we are using open-source Puppet, so edit /etc/puppet/puppet.conf to add the following content:

[master]
  certname=192.168.1.33

Create Puppet site.pp file
    touch /etc/puppet/manifests/site.pp

Initialize site.pp with the following content
    import 'nodes.pp'

Now create a file nodes.pp for Node Definitions
   touch /etc/puppet/manifests/nodes.pp

Add default node definitions in nodes.pp, so that it becomes applicable to all the agents connecting to it.

node default {
       file { "/tmp/puppet-test-file":
         replace => "no", # this is the important property
         ensure  => "present",
         content => "From Puppet\n",
         mode    => "0644",
       }
}

Start Puppet Master
  service puppetmaster start

Puppetmaster listens on port 8140
   netstat -anp | grep ruby

Puppet Client(192.168.2.101) configuration steps

Install puppet and facter
  yum install puppet facter

Edit /etc/puppet/puppet.conf to add master server details
   server=192.168.1.33

Before starting the puppet service on the client machine (192.168.2.101), run the command
    puppet agent --test
  Exiting: no certificate found and waitforcert is disabled

In the puppet master(192.168.1.33) run the command
  puppet cert list
  "centos32" (18:B9:34:16:B9:37:1C:59:7D:2B:DF:EE:FE:0F:C9:8A)
Now accept the cert request by running the following command in puppet master(192.168.1.33)
   puppet cert sign centos32
notice: Signed certificate request for centos32
notice: Removing file Puppet::SSL::CertificateRequest centos32 at '/var/lib/puppet/ssl/ca/requests/centos32.pem'
 
In the puppet client (192.168.2.101, centos32), start the puppet service
        service puppet start

On puppet client(agent - 192.168.2.101), now just observe the entries for puppet in /var/log/messages
Jan 19 11:41:07 centos32 puppet-agent[2027]: (/Stage[main]//Node[default]/File[/tmp/puppet-test-file]/ensure) created
Jan 19 11:41:07 centos32 puppet-agent[2027]: Finished catalog run in 0.04 seconds

Verification

Check if the file /tmp/puppet-test-file is present in puppet client machine 192.168.2.101

[root@centos32 ~]# ls -l /tmp/puppet-test-file 
-rw-r--r-- 1 root root 12 Jan 19 11:41 /tmp/puppet-test-file

Saturday, February 1, 2014

Decrease reserved disk space in ext3/ext4 filesystems

When I set up an ext3 partition on my 1TB hard disk and ran the df -h command, I was in for a surprise:

/dev/rootvg/scribe 992G  407M  941G   1% /scribe

Out of 992GB, only 941GB is available - about 50GB is missing. We lost 5% of 992GB (0.05 × 992 = 49.60, 992 - 49.60 = 942.40).

It seems that ext filesystems by default reserve about 5% of disk space for superuser-level processes and to keep the filesystem from fragmenting as it fills up. However, this reserved space can be reclaimed.

To reduce the reserved blocks from 5% to 2%, use the following command (run here after unmounting the filesystem):

# umount /scribe

# tune2fs -m 2 /dev/rootvg/scribe
tune2fs 1.39 (29-May-2006)
Setting reserved blocks percentage to 2% (5281218 blocks)

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/mapper/rootvg-scribe  992G  407M  971G   1% /scribe

Thus we have reclaimed 30GB of space.


Extending LVM

I have a 1TB hard disk. 400GB of it (/dev/cciss/c0d0p2) is configured as LVM, with the remaining 600GB left unutilized. Now I want to extend the logical volume (LV) by utilizing the unused 600GB.

The LV name is /dev/rootvg/scribe
Get the current size of the LV /dev/rootvg/scribe using the lvdisplay command. It should show LV Size 389.97 GB.

Currently the hard disk consists of just two partitions

/dev/cciss/c0d0p1 - This is configured as /boot partition and not part of LVM
/dev/cciss/c0d0p2 - Configured as part of LVM /dev/rootvg/scribe

# fdisk -l

Disk /dev/cciss/c0d0: 1199.8 GB, 1199865640960 bytes
255 heads, 63 sectors/track, 145875 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

           Device Boot      Start         End      Blocks   Id  System
/dev/cciss/c0d0p1   *           1          13      104391   83  Linux
/dev/cciss/c0d0p2              14       65283   524281275   8e  Linux LVM

Now let us go about extending the Logical Volume /dev/rootvg/scribe by adding extra 600GB.

1) Create a partition of the unused 600GB

fdisk /dev/cciss/c0d0

p - to display the existing partition table
n - to create new partition

Once created

p - should display /dev/cciss/c0d0p3 as regular Linux partition
t - to convert newly created partition(/dev/cciss/c0d0p3) to lvm type
Enter partition number - 3(probably)
Hex code : 8e (Now the partition is converted to lvm type)

p - should show that /dev/cciss/c0d0p3 is of type lvm
w - to write the partition table to the disk

2) The newly added partition will not be visible on the system
   Check the same using
    ls /dev/cciss/   - We shall see that c0d0p3 is missing

   partprobe - Running partprobe reloads the partition table and brings the partition up. No need to reboot the system.

    ls /dev/cciss/ - We shall now see the c0d0p3

3) Now we need to add /dev/cciss/c0d0p3 as a physical volume (PV)
     pvdisplay
This will show only the physical volume (/dev/cciss/c0d0p2)
   PV Name               /dev/cciss/c0d0p2

Now run the command
    pvcreate /dev/cciss/c0d0p3
This command creates a header on each partition so it can be used for LVM.

Now run pvdisplay to see your new PV.
   pvdisplay

It should now show
  PV Name               /dev/cciss/c0d0p2
  PV Name               /dev/cciss/c0d0p3

"VG Name" for /dev/cciss/c0d0p3 should be empty.
Since "/dev/cciss/c0d0p2 is already part of LVM, the "VG Name" for /dev/cciss/c0d0p2 will be displayed as rootvg

Next, we need to add /dev/cciss/c0d0p3 to the volume group rootvg.

4) Run the vgdisplay command to get the "VG Name"
    vgdisplay

5) Use command vgextend to add the PV(/dev/cciss/c0d0p3) to the VG rootvg
  vgextend  rootvg  /dev/cciss/c0d0p3

6)  Run   pvdisplay command to see the value of "VG Name" for PV /dev/cciss/c0d0p3. It shall display as rootvg
   
 7) Run command vgdisplay to see the new size of the VG rootvg.
     Mainly note the value of "VG size". It should be nearly 1TB.
     Also note "Free  PE / Size" and "Total PE". Earlier "Free  PE / Size" shall be "0 / 0". Now it shall show some values.

8) Now comes the extension of Logical Volume /dev/rootvg/scribe
   Run the command
    lvdisplay
   to get the names of existing Logical Volumes.

  From "vgdisplay" command get the value of "Free PE/ Size". It will show the available "Physical Extent / Disk Size".

  Now extend the Logical Volume "/dev/rootvg/scribe"
   lvextend -l +19755 /dev/rootvg/scribe
               (OR)
   lvextend -L +617.34G /dev/rootvg/scribe

  Run the command
    lvdisplay

Check the "LV Size" for "LV Name : /dev/rootvg/scribe". It should be more than current "LV Size 389.97 GB".

 --- Logical volume ---
  LV Name                /dev/rootvg/scribe
  VG Name                rootvg
  LV UUID                P08Qra-MnjE-5tgd-4Muw-c7r5-kY7o-6zUAu2
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                1007.31 GB
  Current LE             32234
  Segments               2
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:2

9) Now need to grow the filesystem for the LV /dev/rootvg/scribe

  umount /scribe
  e2fsck -f /dev/rootvg/scribe
  resize2fs /dev/rootvg/scribe
  mount -a

 Use "df" command to see the new filesystem size

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/cciss/c0d0p1      99M   13M   81M  14% /boot
tmpfs                  12G     0   12G   0% /dev/shm
/dev/mapper/rootvg-scribe
                      992G  407M  941G   1% /scribe

xargs prints file names even if there are no matching files

A problem I faced with the xargs command is that it just lists the files/directories in the current directory when there is no input to xargs.

Let me illustrate with an example

In my current directory, I have just *.txt files

$ ls
abc.txt  ver1.txt  ver1.txt.orig  ver2.txt

When I tried to fetch files with names matching *.jpg, I expected no output when I ran xargs with ls. But it printed all the file names in the current directory.

$ find . -type f -iname "*.jpg" | xargs ls
abc.txt ver1.txt  ver1.txt.orig  ver2.txt

The reason is that xargs runs the given command at least once, even when it receives no input. With no arguments appended, ls simply lists the current directory.

Later I found that GNU xargs has the option "-r" (--no-run-if-empty), which did the trick for me:

$ find . -type f -iname "*.jpg" | xargs -r ls
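
A related pitfall: filenames containing spaces or newlines break a plain xargs pipeline. With GNU find and xargs, null-delimited output handles both, and -r still suppresses the empty run:

$ find . -type f -iname "*.jpg" -print0 | xargs -0 -r ls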