Thursday, February 16, 2012

Socket


A socket = IP address + port number.

A socket is a mechanism for communication between a client program and a server program, whether across a network or within a single computer. A socket is defined as one endpoint of a connection.
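As a concrete, Linux-specific illustration of the "IP address + port" definition: the kernel exposes every open TCP socket in /proc/net/tcp, where each endpoint appears as a hex-encoded address:port pair. A minimal sketch:

```shell
# Each line of /proc/net/tcp (after the header) describes one TCP socket;
# field 2 is the local endpoint encoded as hexadecimal IP:port,
# e.g. 0100007F:0016 = 127.0.0.1, port 22.
tail -n +2 /proc/net/tcp | while read -r _ local _; do
  hexport=${local#*:}                       # keep the port half of IP:port
  printf 'local port %d (hex %s)\n' "0x$hexport" "$hexport"
done | head -5
```

Tools like netstat and ss present the same table in decoded form.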

Wednesday, February 15, 2012

Passwordless Login: SSH Public Key Authentication

To encrypt and decrypt data, there are two classes of cryptographic algorithms

  • Symmetric key algorithms, which use a single key
  • Asymmetric key algorithms, which use two keys

Symmetric Key Algorithms


Symmetric key algorithms use a single key for both encryption and decryption of data. If the sender encrypts data with a key, the recipient must hold the same key to decrypt it. With just two users, a single key can be shared between them and used for all encryption and decryption. With more than two users, however, every pair of users needs its own shared key, so each user has to generate and share a key with every other user. This key distribution problem is the biggest bottleneck of single-key algorithms.

Common symmetric key algorithms include AES, DES, 3DES, Blowfish and IDEA.
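A symmetric round-trip can be sketched with the openssl command-line tool; this uses AES-256 purely as an example cipher, and the passphrase and file names are illustrative:

```shell
# Symmetric encryption: the SAME key ("sharedkey") encrypts and decrypts.
echo 'top secret data' > plain.txt

# encrypt with the shared key
openssl enc -aes-256-cbc -k 'sharedkey' -in plain.txt -out cipher.bin

# decrypt with the same shared key
openssl enc -d -aes-256-cbc -k 'sharedkey' -in cipher.bin -out decrypted.txt

cmp plain.txt decrypted.txt && echo 'symmetric round-trip OK'
```

Decrypting with any other passphrase fails, which is exactly why every communicating pair must share the key in advance.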

Asymmetric Key Algorithms


Asymmetric key algorithms use two keys: data encrypted with one key can be decrypted only with the other. A user who wants to exchange encrypted data generates a key pair - private/public. The private key is kept secret (never shared with anybody), while the public key is distributed to the other users. For confidentiality, the other users encrypt data with the public key, and only the holder of the private key can decrypt it. For authentication, the holder signs data with the private key, and anybody with the public key can verify the signature. Hence, asymmetric key algorithms are also known as public/private key (or public-key) algorithms.

SSH passwordless login uses this asymmetric key algorithm to achieve its objective.
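The two-key property can be demonstrated with openssl; this is a sketch using an RSA key pair (file and message names are illustrative):

```shell
# Generate a key pair: the private key, plus the public key extracted from it.
openssl genrsa -out private.pem 2048 2>/dev/null
openssl rsa -in private.pem -pubout -out public.pem 2>/dev/null

echo 'hello' > msg.txt

# Anyone holding public.pem can encrypt...
openssl pkeyutl -encrypt -pubin -inkey public.pem -in msg.txt -out msg.enc

# ...but only the holder of private.pem can decrypt.
openssl pkeyutl -decrypt -inkey private.pem -in msg.enc -out msg.dec

cmp msg.txt msg.dec && echo 'asymmetric round-trip OK'
```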

Public-key authentication allows us to prove our identity to a remote host using a cryptographic key instead of a login password. SSH keys are more secure than passwords because the private key is never transmitted over the network, while passwords are sent over the network.
Public-key authentication is achieved by generating a key pair, consisting of a public key (shared with everyone) and a private key (kept secret and not shared with anybody). The private key is able to generate signatures. A signature created with your private key cannot be forged by anybody who does not have that key; the public key, which is shared with everyone, can verify that a particular signature is genuine. The private key is stored on your local machine, also called the client. The public key is copied to the remote (server) machine.
However, there is a problem here: if your private key is stored unprotected on your own computer, then anybody who gains access to it will be able to generate signatures as if they were you, and so log in to your server under your account. For this reason, the private key is usually encrypted when stored on the client, using a passphrase of your choice.

How to set up public-key authentication between an OpenSSH client and an OpenSSH server?

  • Generate a public/private key pair as follows (the example uses DSA; recent OpenSSH releases disable DSA keys, so -t rsa or -t ed25519 is preferred today)

$ mkdir -p ~/.ssh
$ chmod 700 ~/.ssh
$ cd ~/.ssh
$ ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/home/test/.ssh/id_dsa): 
Enter passphrase (empty for no passphrase): *******
Enter same passphrase again: *******
Your identification has been saved in id_dsa
Your public key has been saved in id_dsa.pub.

  • Copy the public key to the remote host (note the trailing colon - without it, scp just creates a local file named remotehost)

$ scp -p id_dsa.pub remoteuser@remotehost:
Password: ******

(The ssh-copy-id utility automates this step and the next one.)

  • Installing public key in remote host(ssh server)

#Login to remote host
$ ssh remoteuser@remotehost
  Password: *****
remotehost$ mkdir -p ~/.ssh
remotehost$ chmod 700 ~/.ssh
remotehost$ cat id_dsa.pub >> ~/.ssh/authorized_keys
remotehost$ chmod 600 ~/.ssh/authorized_keys

  • In the remote host, the ssh server must be configured to permit public-key authentication

/etc/ssh/sshd_config:
PubkeyAuthentication yes

SSH:A Brief Idea


SSH (Secure Shell) uses a client-server model. SSH ensures that everything sent across the network between client and server, including passwords, is encrypted. SSH uses public-key cryptography: the SSH daemon (server), listening on port 22, offers its public key to clients and keeps the private key to itself. This public/private key pair is called the host key. During connection setup, the host key lets the client verify that it is talking to the genuine server, and the two sides negotiate a symmetric session key that encrypts all subsequent traffic. Since the private host key never leaves the server, the data remains secure; even if someone captures the SSH traffic between client and server, all they will see is garbage.

Private and public host keys, needed for ssh connection, are available under the path /etc/ssh/ as follows

SSH Version 2 Host keys

DSA keys
  • ssh_host_dsa_key (Private host key)
  • ssh_host_dsa_key.pub (Public host key, to be shared with client, when the client tries to establish a connection)
RSA keys
  • ssh_host_rsa_key (Private host key)
  • ssh_host_rsa_key.pub (Public host key, to be shared with client, when the client tries to establish a connection)

SSH Version 1 Host keys
  • ssh_host_key    (Private host key)
  • ssh_host_key.pub (Public host key, to be shared with client, when the client tries to establish a connection)
How does the client get the public host key of the remote machine running ssh daemon(server)?

Say, the client is trying to establish a ssh connection to remotehost 

$ ssh remotehost
The authenticity of host 'remotehost' can't be established.
RSA key fingerprint is 98:2e:d7:e0:de:9f:ac:67:28:c2:42:2d:37:16:58:4d.
Are you sure you want to continue connecting (yes/no)?

The client does two things
  1. It retrieves the public host key from the remote host
  2. It checks whether that host key is already known to it, by consulting its host key list (in ~/.ssh/known_hosts)
If the retrieved host key is already present in ~/.ssh/known_hosts, the client assumes it is talking to the correct host.
If the host key of remotehost is not yet known to the client, it presents the key's fingerprint for approval. Once you validate the host key the first time around, you will not be prompted for confirmation on subsequent logins.

When logging in to a host whose host key has already been validated, why do we sometimes see the message WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! ?

This can happen for two reasons
  1. OpenSSH was reinstalled on the remote host and the original host key was not restored, OR
  2. The remote host was replaced by another host


Virtualisation Intro

Wednesday, February 8, 2012

netcat - File Transfer and Port Scanning

The netcat command finds its uses in
  • File transfer
  • Port scanning
File Transfer using netcat

Using netcat, the server can either send or retrieve data. (Note: the listening syntax varies by netcat flavor - traditional netcat uses nc -l -p 13000, while OpenBSD netcat uses nc -l 13000.)

Scenario 1 : Server sending the file to client 

# start the sending server
$ cat testfile | nc -l -p 13000

# start the retrieving client
$ nc <server> 13000 > testfile


Scenario 2 : Client sending the file to server
# start the retrieving server
$ nc -l -p 13000 > testfile

# start the sending client
$ cat testfile | nc <server> 13000


To monitor the progress of file transfer 


Using pv command we can monitor the progress of file transfer

# start sending server
$ cat test.iso | pv -b | nc -l  13000

# start receiving client
$ nc <server> 13000 | pv -b > test.iso
  11B 0:00:08 [1.32B/s ] [ <=> ]


Transfer Compressed Data

# create an ISO image on the fly and compress the data stream
$ dd if=/dev/sr0 | gzip -9 | nc -l  13000

# retrieve and decompress the data stream at client side
$ nc <server> 13000 | gunzip | pv -b > testdvd.img


Port Scanning using netcat

# scan ports within the interval [20..80]
$ nc -v -z  <mywebsite.com> 21-80

# scan local ports [21..25], 80 and 8080
$ echo QUIT | nc -v -z localhost 21-25 80 8080
localhost [127.0.0.1] 25 (smtp) open
localhost [127.0.0.1] 22 (ssh) open
localhost [127.0.0.1] 80 (www) open

Good Reference:
http://injustfiveminutes.com/2013/11/19/netcat-cheat-sheet/



Difference between Process and Threads

A process has five fundamental parts:
  • code ("text")
  • data (VM)
  • stack
  • file I/O, and 
  • signal tables. 
Processes have a significant amount of overhead when switching: the processor's cached tables (such as the TLB) have to be flushed for each task switch. Also, the only way to share information between processes is through mechanisms like pipes and shared memory. If a process spawns a child process using fork(), the only part that is shared is the text (code) segment.

Threads reduce overhead by sharing these fundamental parts. By sharing them, switching happens much more quickly and efficiently. Also, sharing information is not so "difficult" anymore: everything can be shared. There are two types of threads: user-space and kernel-space.

When we create a new thread in a process, the new thread of execution gets its own stack (and hence its own local variables) but shares global variables, file descriptors, signal handlers, and the current directory state with the process that created it.

When a process executes a fork() call, a new copy of the process is created with its own variables and its own PID. The new process is scheduled independently and, in general, executes almost independently of the process that created it.

Every process is protected from every other process in the system by the kernel, using the Memory Management Unit (MMU). Since each process is independent of the others, the kernel can schedule several to execute in parallel when there are several CPUs or cores to schedule on.

Threads enhance the process model with multiple, parallel, flows of execution within a process. All threads within a process share the same memory space.

The kernel treats threads as separate and independent entities so it can schedule several threads to run in parallel, just as it can with complete processes.

So the key difference between processes and threads is the way memory is managed. The two most important things to note about threads are
  • Inter-thread communication is fast
  • There is no protection between threads
Since processes do not naturally share memory it is difficult for one process to communicate with another. Several Inter-Process Communications (IPC) methods exist but they all rely on passing data via some intermediary such as the file system or network stack. Ultimately the kernel manages communications between them.

Threads, on the other hand, can communicate directly using shared memory objects such as arrays of data (buffers). The disadvantage of threads is that a bug in one thread can corrupt the memory being used by another thread.

POSIX is the standard for threads

POSIX.1 specifies a set of interfaces (functions, header files) for threaded programming, commonly known as POSIX threads, or Pthreads. A single process can contain multiple threads, all of which execute the same program. These threads share the same global memory (data and heap segments), but each thread has its own stack (automatic variables).

POSIX.1 also requires that threads share a range of other attributes (i.e. these attributes are process-wide rather than per-thread):

  • process ID
  • parent process ID
  • process group ID and session ID
  • controlling terminal
  • user and group IDs
  • open file descriptors
  • record locks
  • signal dispositions
  • file mode creation mask
  • current directory and root directory
  • interval timers and POSIX timers
  • nice value
  • resource limits
  • measurements of the consumption of CPU time and resources

As well as the stack, POSIX.1 specifies that various other attributes are distinct for each thread, including:

  • thread ID
  • signal mask
  • the errno variable
  • alternate signal stack
  • real-time scheduling policy and priority
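This process-wide vs per-thread split can be observed directly on Linux: every thread of a process shows up under /proc/&lt;pid&gt;/task, and /proc/&lt;pid&gt;/status reports the thread count. A minimal sketch:

```shell
# Each thread of a process appears as a directory under /proc/<pid>/task,
# and the "Threads:" field of /proc/<pid>/status gives the thread count.
pid=$$                          # this shell: a single-threaded process
ls /proc/$pid/task              # one entry per thread (here just the shell's own TID)
grep '^Threads:' /proc/$pid/status
```

For a multi-threaded program, /proc/&lt;pid&gt;/task holds one entry per thread, all sharing the same PID.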

How to view processes and threads in Linux?

To see every process on the system using standard syntax:
  • ps -e
  • ps -ef
  • ps -eF
  • ps -ely
To see every process on the system using BSD syntax:
  • ps ax
  • ps axu
 To print a process tree:
  •  ps -ejH
  • ps axjf
To get info about threads:
  •  ps -eLf
  • ps axms
To get security info:
  • ps -eo euser,ruser,suser,fuser,f,comm,label
  • ps axZ
  • ps -eM
To see every process running as root (real & effective ID) in user format:
  •   ps -U root -u root u
To see every process with a user-defined format:
  • ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:14,comm
  • ps axo stat,euid,ruid,tty,tpgid,sess,pgrp,ppid,pid,pcpu,comm
  • ps -eopid,tt,user,fname,tmout,f,wchan
Print only the process IDs of syslogd:
  •  ps -C syslogd -o pid=
Print only the name of PID 42:
  •  ps -p 42 -o comm=

MySQL is a good example of a program that uses threads for managing client connections.

[root@dhcppc3 ~]# ps -eL | grep mysql
   2787    2787 pts/0    00:00:00 mysqld_safe
   2916    2916 pts/0    00:00:00 mysqld
   2916    2918 pts/0    00:00:00 mysqld
   2916    2919 pts/0    00:00:00 mysqld
   2916    2920 pts/0    00:00:00 mysqld
   2916    2921 pts/0    00:00:00 mysqld
   2916    2923 pts/0    00:00:00 mysqld
   2916    2924 pts/0    00:00:00 mysqld
   2916    2925 pts/0    00:00:00 mysqld
   2916    2926 pts/0    00:00:00 mysqld
   2916    2927 pts/0    00:00:00 mysqld

So we observe that all the mysqld threads share the same process ID - 2916, in this example.
A more elaborate output:

[root@dhcppc3 ~]# ps H -Le | grep mysql
   2787    2787 pts/0    S      0:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql --socket=/var/lib/mysql/mysql.sock --pid-file=/var/run/mysqld/mysqld.pid --basedir=/usr --user=mysql
   2916    2916 pts/0    Sl     0:00 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock
   2916    2918 pts/0    Sl     0:00 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock
   2916    2919 pts/0    Sl     0:00 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock
   2916    2920 pts/0    Sl     0:00 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock
   2916    2921 pts/0    Sl     0:00 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock
   2916    2923 pts/0    Sl     0:00 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock
   2916    2924 pts/0    Sl     0:00 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock
   2916    2925 pts/0    Sl     0:00 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock
   2916    2926 pts/0    Sl     0:00 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock
   2916    2927 pts/0    Sl     0:00 /usr/libexec/mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock




 


Why du and df show different results?

Firstly, let us see how df and du work.

The df command makes a filesystem call and gets the disk usage details directly from the filesystem's superblock.

du traverses (walks) the directory tree, checking the size of each file, and calculates the disk space used by adding up the sizes.

Sometimes df will show more disk space used than du. Why so?

A couple of reasons
  • A process holds open files that have since been deleted
  • Running du as a non-root user may not allow it to read directories with restricted read permissions, so it undercounts

Let us examine case 1.
Suppose a process is writing to a log file and filling up disk space, and to free up space we delete that log file while the process is still writing to it. Now du will show that the space has been freed, but df will not; we will instead find the disk still filling up.
When a Linux process opens a file, the reference count on that file is incremented. When the file is removed, the directory entry disappears, but because the running process still holds the file open, the reference remains and the disk blocks associated with the file are not freed. Hence df and du show different disk usage results. The right approach is to stop (or restart) the process and then remove the log file - or to truncate the file (e.g. > logfile) instead of deleting it. Then df and du will show the same result again.
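The deleted-but-open situation is easy to reproduce in a shell; /proc marks such descriptors as "(deleted)" (file names here are illustrative):

```shell
# Reproduce the du/df discrepancy: a deleted file whose blocks are still
# held open by a process.
logfile=$(mktemp)
exec 3>"$logfile"          # keep a file descriptor open on the file
echo 'some log data' >&3
rm "$logfile"              # du no longer counts it, but the blocks are not freed
ls -l /proc/$$/fd/3        # the link target now ends in "... (deleted)"
exec 3>&-                  # closing the descriptor finally releases the space
```

lsof +L1 lists all such open-but-deleted files system-wide, which is handy when hunting down the process responsible.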

AMQP, RabbitMQ

A very informative link which explains AMQP quite well


http://virtualization-for-layman.blogspot.in/2011/06/what-is-rabbitmq.html


Monday, February 6, 2012

Service Availability - MTTR, MTTD, MTTF, MTBF


Availability of a Website or service(A) = Uptime / (Uptime + Downtime)

Availability of a website or service is measured on four important parameters

  • MTTD (Mean Time To Diagnose) = the average time it takes to diagnose a problem
  • MTTR (Mean Time To Repair) = the average time it takes to fix a problem
  • MTTF (Mean Time To Failure) = the average time the service behaves correctly before failing
  • MTBF (Mean Time Between Failures) = the average time between two successive failures of the service

MTBF = MTTD + MTTR + MTTF

Availability  = MTTF / MTBF

Service Availability expressed in terms of 9s

90% (one 9) availability = 36 days, 12 hours downtime/year
99% (two 9s) availability = 87 hours, 36 minutes downtime/year
99.9% (three 9s) availability = 8 hours, 45 minutes, 36 seconds downtime/year
99.99% (four 9s) availability = 52 minutes, 33 seconds downtime/year
99.999% (five 9s) availability = 5 minutes, 15 seconds downtime/year
99.9999% (six 9s) availability = 31 seconds downtime/year

If we say we provide 99.9% uptime availability per month, then the permissible downtime is calculated as follows

30 days = 720 hours = 43200 minutes
99.9% of 43200 minutes = 43156.8 minutes
43200 minutes - 43156.8 minutes = 43.2 minutes

So the service can go down without penalty for 43.2 minutes per month
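The same arithmetic can be done as a one-liner, with the availability target as a parameter:

```shell
# Permissible downtime per 30-day month for a given availability target
awk -v avail=99.9 'BEGIN {
  total = 30 * 24 * 60                 # minutes in 30 days = 43200
  down  = total * (100 - avail) / 100  # minutes of allowed downtime
  printf "%.1f minutes downtime allowed\n", down
}'
# prints: 43.2 minutes downtime allowed
```

Changing avail to 99.99 gives 4.3 minutes per month, which shows how sharply each extra nine tightens the budget.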




Hard Link Vs Symbolic(Soft) Link

Hard Link
  • A hard link is an additional name (directory entry) for an existing file - not a separate copy of its data.
  • Hard links share the same inode value, and hence the same file permissions. Even if the original name is deleted, the file's data survives as long as at least one hard link remains.
  • Cannot create hard links to directories.
  • Cannot create hard links across filesystems.
  • A backup tool that does not detect hard links backs up the full file content for each link, while for a symlink only the link itself is backed up.
  • ls reports a hard link's size as the size of the file, but creating a hard link consumes no additional data blocks on disk.
Soft Link
  • A soft link is a symbolic link - a small separate file containing the path of the original file.
  • Soft links have a different inode value from the original file, and their own (unused) permission bits.
  • If the original file is deleted, the soft link dangles and no longer resolves.
  • Can create soft links to directories
  • Soft links can cross filesystems
  • The size of a soft link is the number of characters in the target path it points to
More about links

Creation of Hard link to file test.txt
$ ln test.txt  htest.txt

 Creation of Soft link to file test.txt
$ ln -s test.txt   ltest.txt



Let us examine the size of files

$ ls -l *test.txt
-rw-rw-r--. 2 ghost ghost 452263 Feb  7 01:38 htest.txt
lrwxrwxrwx. 1 ghost ghost      8 Feb  7 01:45 ltest.txt -> test.txt
-rw-rw-r--. 2  ghost ghost 452263 Feb  7 01:38 test.txt



The target path test.txt has 8 characters. Hence the size of the soft link ltest.txt is 8.

Let us examine the inodes of the files

$ ls -i *test.txt
263871 htest.txt  
262315 ltest.txt  
263871 test.txt

We can see that the hard link (htest.txt) and the original file (test.txt) share the same inode value 263871


To find files having hard links

$ find . -type f -links +1
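The behaviour described above can be verified in one short, self-contained session (file names are illustrative):

```shell
cd "$(mktemp -d)"
echo 'hello' > test.txt
ln test.txt htest.txt            # hard link: another name for the same inode
ln -s test.txt ltest.txt         # soft link: a new inode pointing at the name

stat -c '%i %h %s %n' test.txt htest.txt   # same inode, link count 2
stat -c '%i %h %s %n' ltest.txt            # different inode, size 8 ("test.txt")

rm test.txt
cat htest.txt                    # still prints "hello": data lives while a link remains
cat ltest.txt                    # fails: the symlink now dangles
```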

Why can't hard links be created to directories?

A hard link to a directory would increase the reference count of that directory. Let us examine what could happen if hard links to directories were allowed - take the case a/b/c -> a, where c is a hard link to directory a. A loop is created: subdirectory c points back to grandparent directory a. Deleting a directory, or traversing one with commands like find and du, then becomes a very expensive operation and can recurse infinitely because of the loop.

Creating a soft link to a directory does not affect the reference count of the original directory, so a soft link does not prevent the directory from being deleted and removed from disk. Also, traversal commands like du and find do not follow symlinks by default.

Saturday, February 4, 2012

What is my public ip address?


The link http://www.commandlinefu.com/commands/tagged/707/ip-address provides many ways to get our public IP address. Among them, the following looked simple and instant
  • $ curl ifconfig.me
  • $ dig @208.67.222.222 myip.opendns.com
     



MBR Partition Table Vs GUID Partition Table(GPT)


There are two standards for the layout of the partition table on a physical hard disk - MBR and GPT.

To partition block devices, there are currently two partitioning technologies in use: MBR and GPT

The MBR (Master Boot Record) scheme uses 32-bit identifiers to specify the start sector and length of the partitions.

The GPT (GUID Partition Table) setup uses 64-bit identifiers for the partitions. The area in which it stores the partition information is also much bigger than the 512 bytes of an MBR.

When the system's firmware interface is UEFI (instead of BIOS), GPT is practically mandatory, as MBR raises compatibility issues there.

GPT also has the advantage of a backup GPT at the end of the disk, which can be used to recover from damage to the primary GPT at the beginning. GPT also carries CRC32 checksums to detect errors in the header and partition tables.

Using GPT on a BIOS-based computer works, but one cannot dual-boot with Windows, because Windows boots from a GPT disk only in UEFI mode.

As per Wikipedia, the MBR partitioning scheme, dating from the early 1980s, imposed limitations that affect the use of modern hardware. Intel therefore developed a new partition-table format, GPT, in the late 1990s as part of what eventually became UEFI.

At this juncture, we need to say a word about BIOS and UEFI. The BIOS was originally created for the IBM PC; while BIOS has evolved considerably to adapt to modern hardware, the Unified Extensible Firmware Interface (UEFI) is designed to support new and emerging hardware.

Traditional BIOS supports a maximum disk size of 2.2TB. By using GPT, disk sizes greater than 2.2TB can be supported for BIOS. However, there is a catch while using GPT with BIOS - GPT can only be used for data disks; it cannot be used for boot drives with BIOS; therefore, boot drives can only be a maximum of 2.2TB in size if BIOS is used. UEFI overcomes these shortcomings.


MBR - Master Boot Record parition table

  • Supports just 4 primary partitions or 3 primary + 1 extended partition. In extended partition, we can define many logical partitions.
  • Partitions with size exceeding 2TB cannot be created using MBR
  • fdisk command can be used for creating MBR partitions


GPT - GUID Partition Table

  • Up to 128 primary partitions can be created (by default). Hence there is no need for extended or logical partitions.
  • GPT keeps 2 tables: one primary and one secondary at the end of the disk, for backup purposes
  • Allows creation of partitions larger than 2TB. The maximum partition size is 9.4 zettabytes.
  • In order to use GPT, GPT support must be enabled in the kernel (CONFIG_EFI_PARTITION, already set in most distribution kernels).
  • GPT partitions can be created with the GNU parted command
Bootloader Support
  • GRUB-Legacy bootloader does not support GPT
  • GRUB2 provides the ability to boot from GPT in both BIOS and UEFI based systems. 

More info can be found in 





Difference between SIGTERM(15) and SIGKILL(9)

When the kill command is issued to a process, by default the SIGTERM (15) signal is sent. If the process does not terminate, we issue kill -9 (SIGKILL).

SIGTERM can be blocked, handled or ignored by the process, but SIGKILL cannot be caught or ignored.

Upon receiving a SIGTERM, the process can either stop after cleaning up its resources, or ignore the signal and keep running indefinitely. In the case of SIGKILL, however, the process never gets a chance to handle or ignore the signal at all - the kernel terminates the process directly.
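The difference is easy to demonstrate with a shell trap (a minimal sketch; the trap message is illustrative):

```shell
# SIGTERM can be trapped; SIGKILL cannot.
bash -c '
  trap "echo SIGTERM caught, cleaning up" TERM
  kill -s TERM $$      # delivered to the trap handler; the process survives
  echo "still alive after SIGTERM"
  kill -s KILL $$      # cannot be trapped; the kernel stops the process here
  echo "never printed"
'
echo "child exit status: $?"   # 137 = 128 + 9, i.e. killed by signal 9
```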

Timestamps associated with every file/directory in Unix/Linux filesystem


There are three timestamps associated with every file (or directory) in a Unix/Linux filesystem

  • Access time (atime) - the last time the file's data was read
  • Modification time (mtime) - the last time the file's data was written
  • Change time (ctime) - the last time the file's inode was changed (the data, or attributes such as permissions and ownership)
Using the stat command we can see all three timestamps

$ stat bookmarks.html
Access: 2011-10-30 21:11:00.787309030 +0530
Modify: 2011-10-02 19:22:08.660739370 +0530
Change: 2011-10-02 19:22:08.660739370 +0530

Scenario 1: Just access the file
$ less bookmarks.html

$ stat bookmarks.html
Access: 2012-02-05 00:09:16.690477352 +0530
Modify: 2011-10-02 19:22:08.660739370 +0530
Change: 2011-10-02 19:22:08.660739370 +0530

We can see that atime alone is updated

Scenario 2: Change the contents of the file
Let us modify the file bookmarks.html
$ vi bookmarks.html

$ stat bookmarks.html
Access: 2012-02-05 00:11:06.814915133 +0530
Modify: 2012-02-05 00:11:06.814915133 +0530
Change: 2012-02-05 00:11:06.814915133 +0530

We can now see that all three timestamps - atime, mtime and ctime - are updated

Scenario 3: Change the attributes of the file
Let us change the file's permissions
$ chmod 644 bookmarks.html

$ stat bookmarks.html
Access: 2012-02-05 00:11:06.814915133 +0530
Modify: 2012-02-05 00:11:06.814915133 +0530
Change: 2012-02-05 00:24:20.619104936 +0530

We can see that ctime alone is updated

Time of last file modification(mtime): ls -l
Time of last access(atime): ls -lu
Time of last inode modification(ctime): ls -lc 
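Scenario 3 can be reproduced as a self-contained script, reading the raw epoch timestamps with stat -c:

```shell
# An attribute change (chmod) moves ctime but leaves mtime untouched.
f=$(mktemp)
stat -c 'atime=%X mtime=%Y ctime=%Z' "$f"   # all three start equal
sleep 1
chmod 600 "$f"                              # attribute change only
stat -c 'atime=%X mtime=%Y ctime=%Z' "$f"   # ctime has advanced; mtime has not
rm "$f"
```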
Note about atime


If atime updates are enabled in a Linux/Unix filesystem, then every time a file is read its inode needs to be updated. These atime updates can generate a significant amount of unnecessary write traffic and file-locking traffic, which can degrade performance; therefore it may be preferable to turn atime updates off by specifying the noatime option in /etc/fstab or on the mount command line. (Modern Linux kernels default to relatime, which already avoids most of this overhead.)


How to install flash plugin in Linux

I run CentOS 6.2 64-bit Linux.

To install the flash plugin for use in Firefox, here are the steps to follow
  • Go to http://get.adobe.com/flashplayer/
  • Under the version to download, select YUM for Linux (YUM)
  • Download adobe-release-x86_64-1.0-1.noarch.rpm
  • As root, run rpm -ivh adobe-release-x86_64-1.0-1.noarch.rpm
  • As root, run yum install flash-plugin
  • Restart Firefox

Friday, February 3, 2012

Beyond init: systemd

A video link describing systemd - a replacement for  init

http://blip.tv/linuxconfau/beyond-init-systemd-4715015#disqus_thread

http://bsdmag.org/

A must-visit for those interested in the BSD world. The magazine has some interesting topics.

http://bsdmag.org

Zombie process

A zombie is a child process that has terminated but has not yet been removed from the system process table, because its parent has not acknowledged the child's death (by calling wait()). Until then, the child is said to be in the zombie state.
Note that a process in the zombie state is not alive; it uses no resources and accomplishes no work. However, it is not allowed to disappear until the parent acknowledges its exit status. If a process is creating a lot of zombies, it has a programming bug and is not reaping its children correctly.

Will too many zombie processes create issues?

While zombie processes aren't a problem in themselves, as they consume no resources, there is one concern: Linux systems have a maximum number of processes, and thus of process ID numbers. If enough zombies accumulate, that maximum is reached and new processes can't be launched.

The maximum number of processes can be listed by typing "cat /proc/sys/kernel/pid_max" in a terminal window, and is usually 32768.
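A zombie can be created on purpose with a classic shell trick: background a short-lived child, then exec the parent into a program (sleep, here) that never calls wait(). A sketch, with illustrative durations:

```shell
# The child ("sleep 0.1") exits, but its parent (exec'd into "sleep 2")
# never reaps it, so the child lingers as a zombie (state Z / <defunct>).
bash -c 'sleep 0.1 & exec sleep 2' &
parent=$!
sleep 0.5
ps -o pid,stat,comm --ppid "$parent"   # the child shows state Z
wait "$parent"                         # once the parent dies, init reaps the zombie
```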

Pipes : Named and UnNamed

In Unix/Linux world, pipes allow separate processes to communicate without having been designed explicitly to work together.

There are 2 kinds of pipes : unnamed and named. They are well explained in the following links

Difference between unnamed pipes and named pipes

Unnamed pipe

  1. These are created by the shell automatically (or by a process calling pipe()).
  2. They exist only in the kernel; they have no name in the filesystem.
  3. They cannot be accessed by arbitrary processes - only the creating process and its descendants can use them.
  4. They are opened at the time of creation.
  5. They are unidirectional.
  6. Unnamed pipes may only be used between related processes (parent/child, or children of the same parent).
Eg:

ls | less

Here, both processes are related - they have the same parent (the bash shell process is the parent of both the ls and less commands). The pipe is unidirectional in the sense that the output of ls is sent as the input of the less command.

Named Pipe ( also called FIFO, First In FIrst Out)

  1. They are created explicitly, using the mkfifo or mknod command (or the corresponding system calls).
  2. They exist in the filesystem with a given file name.
  3. They can be viewed and accessed by any two unrelated processes. ls shows the file type character "p" for a named pipe.
  4. They are not opened at creation; each process must open the FIFO explicitly.
  5. They are bi-directional.
  6. A process writing to a named pipe blocks until some process opens the pipe for reading.
  7. A "broken pipe" error (SIGPIPE) occurs when a process writes to a pipe whose read end has been closed.
Eg:

$ mkfifo nampipe

$ ls -l nampipe
prw-rw-r-- 1 foo foo 0 Jan 13 20:38 nampipe

So a pipe with name, nampipe, is created now.

Let us see how it works.

In one console, type the following command

$ ls -l > nampipe

The above command will hang. This is because the other end of the pipe, nampipe, is not yet connected; the kernel suspends the first process until a second process opens the pipe.

Now let us run the second process, which opens the other end of the pipe, nampipe, in another console


$ cat < nampipe
total 636
drwxrwxr-x  4 foo foo   4096 Jan  5 22:17 a
-rwxrwxr-x  1 foo foo   9509 Jan  5 22:11 a.out
-rw-rw-r--  1 foo foo  10240 Jan  3 19:50 archive.tar

See, here the first and second processes are unrelated. They are run in different shell consoles (hence the two processes have different parent IDs).
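The two-console demo above can also be scripted in a single shell, with the writer in the background (file names are illustrative):

```shell
d=$(mktemp -d)
mkfifo "$d/nampipe"                       # creates the FIFO; nothing is open yet

# writer: blocks until some reader opens the other end of the pipe
( echo 'hello via fifo' > "$d/nampipe" ) &

# reader: connected to the writer only through the named pipe
read -r line < "$d/nampipe"
echo "reader got: $line"

wait
rm -r "$d"
```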


Differences between files and pipes 


  1. Pipes have a bounded size. PIPE_BUF (at least 512 bytes; 4096 on Linux) is the largest write guaranteed to be atomic, and the pipe's total capacity is also limited (64 KiB by default on modern Linux).
  2. Access is sequential - one can only read from or write to a pipe; the current position cannot be moved (lseek is not permitted).
  3. A write appends data to the input of a pipe, while a read takes data from the output of a pipe, but:

  • data that is read is removed from the pipe
  • if the pipe is empty and at least one descriptor is open for writing, then a read blocks until some data is written to the pipe or until the write descriptors are closed
  • a process that wants to write blocks if the pipe is full
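Two of these properties can be checked from the shell: getconf reports PIPE_BUF for a given filesystem, and wrapping a reader in timeout makes the blocking behaviour visible (the 1-second timeout is illustrative):

```shell
# The atomic-write limit for pipes on the root filesystem
getconf PIPE_BUF /     # typically 4096 on Linux; POSIX guarantees at least 512

# With no writer attached, a read on a FIFO blocks; timeout turns the
# block into a visible timeout (exit status 124).
d=$(mktemp -d)
mkfifo "$d/p"
timeout 1 cat "$d/p"
echo "cat exit status: $?"   # 124: the read never returned on its own
rm -r "$d"
```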