Sunday, April 21, 2013

Connecting to a remote website: Explained


Let us examine what actually happens when we access a URL from a browser.

On the client side, the URL is parsed to extract a hostname, and the hostname is translated into an IP address. The browser is assigned an unprivileged port (for example, TCP port 14000) for the connection. An HTTP message is constructed for the web server. It is encapsulated in a TCP segment, wrapped in an IP packet and sent out to the web server listening on port 80.
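The parsing step above can be sketched in Python. This is only an illustration: the helper names split_url and resolve and the example URL are assumptions, not something a browser exposes, and resolve needs working DNS so it is defined but not called here.

```python
import socket
from urllib.parse import urlsplit

# Well-known default ports, used when the URL does not name a port.
DEFAULT_PORTS = {"http": 80, "https": 443}

def split_url(url):
    """Return (scheme, hostname, port, path) for an HTTP(S) URL."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port, parts.path or "/"

def resolve(hostname):
    """Translate a hostname into an IPv4 address (requires network/DNS access)."""
    return socket.gethostbyname(hostname)

print(split_url("http://www.example.com/index.html"))
```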

Now a TCP connection needs to be established with the server using a three-way handshake:

When the client program (say, a browser) sends its first connection message, the SYN flag is accompanied by a synchronization sequence number. This sequence number is used as the starting point for numbering all the data bytes the client will send.

On the server machine, the kernel hands the SYN to the server. The server is listening on port 80 and is notified of an incoming connection request (the SYN connection synchronization request flag) from the source socket pair (client IP address, 14000). The server calls accept(). The server allocates a new socket on its end (web server IP address, 80) and associates it with the client socket.

The kernel on the server machine sends an acknowledgment (ACK) of the client's SYN message, along with its own synchronization request (SYN), in a single SYN/ACK segment. The connection is now half open.

Along with the ACK flag, the server includes the client's sequence number incremented by one. The purpose of the ACK flag is to acknowledge the data identified by the client's sequence number. The acknowledgment number, which is the client's sequence number plus one, is the next data byte the server expects to receive. The client is now free to discard its original SYN message, since the server has acknowledged receipt of it.

The server also sets the SYN flag in its first message. As in the client's first message, the SYN flag is accompanied by a synchronization sequence number. The server is passing along its own starting sequence number for its half of the connection.

The first message is the only message the server will send with the SYN flag set. This and all subsequent messages have the ACK flag set. The presence of the ACK flag in all server messages, compared with its absence in the client's first message, is a critical difference.

The client machine receives the SYN/ACK message sent by the server and replies with its own acknowledgment (ACK). Now the TCP three-way handshake is complete and the connection is ESTABLISHED.

From here on, both the client and server set the ACK flag. The SYN flag won't be set by either program.

The TCP three-way handshake can be summarized as follows:

1) Client end
    Client sends SYN

2) Server end
    Kernel hands SYN to Server
    Server calls accept()
    Kernel sends SYN/ACK to the client

3) Client end
   Client sends ACK

TCP Connection is ESTABLISHED.
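The sequence and acknowledgment numbers exchanged in these three steps follow simple arithmetic. Here is a small Python sketch of it; the function name and the two initial sequence numbers (1000 and 5000) are made up for illustration, and no packets are actually sent.

```python
def three_way_handshake(client_isn, server_isn):
    """Return the three handshake segments as (sender, flags, seq, ack) tuples."""
    mod = 2 ** 32  # sequence numbers are 32-bit values and wrap around
    syn = ("client", "SYN", client_isn, None)
    # The server acknowledges the client's SYN by asking for client_isn + 1.
    syn_ack = ("server", "SYN/ACK", server_isn, (client_isn + 1) % mod)
    # The client acknowledges the server's SYN by asking for server_isn + 1.
    ack = ("client", "ACK", (client_isn + 1) % mod, (server_isn + 1) % mod)
    return [syn, syn_ack, ack]

for sender, flags, seq, ack in three_way_handshake(1000, 5000):
    print(sender, flags, "seq", seq, "ack", ack)
```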

Now the first HTTP request data is passed from the client to the server.
On the server end, the kernel passes the connection to the server's accept() call.

So, a TCP connection for a typical HTTP request looks as follows

SYN (client->server), 
SYN/ACK (server->client), 
ACK (client->server) - TCP Handshake Complete. TCP Connection Established 

Now the request data is passed from client to server

HTTP Request Data (client->server)

Illustration with example

Let us illustrate it with the output of the tcpdump command.

Suppose I am accessing the website 74.125.236.183 (www.google.co.in) using the curl command as follows


$ curl -IL 74.125.236.183
HTTP/1.1 200 OK
Date: Sun, 21 Apr 2013 07:48:49 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: NID=67=Y-udEdGqbMrB-O-d_Hw09tH2nhhZJ6LmLNt-LAGXVymAq1Fzejl5qFZFdK68DwpE6wxzVLZ7KJFTucQ2zIs6MLxD7KY3MiR2XBqcGvMrmBc5eUaMn-W6Tw7XStpL9QhI; expires=Mon, 21-Oct-2013 07:48:49 GMT; path=/; domain=.; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Transfer-Encoding: chunked

Using the tcpdump command, the above transaction can be captured as follows

# tcpdump -w google.pcap -i bond0 host 74.125.236.183

Press Ctrl+C once the above curl command is complete.

Examine the captured packets in the file google.pcap as follows

Here, 192.168.1.33 is the client IP
          74.125.236.183 is the server IP

# tcpdump -nnr google.pcap

reading from file google.pcap, link-type EN10MB (Ethernet)
13:20:36.724978 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [S], seq 3247485552, win 14600, options [mss 1460,sackOK,TS val 1544059 ecr 0,nop,wscale 7], length 0
13:20:36.756286 IP 74.125.236.183.80 > 192.168.1.33.41002: Flags [S.], seq 3589360082, ack 3247485553, win 62392, options [mss 1430,sackOK,TS val 977680945 ecr 1544059,nop,wscale 6], length 0
13:20:36.756326 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [.], ack 1, win 115, options [nop,nop,TS val 1544091 ecr 977680945], length 0

So far, the TCP three-way handshake is complete and a TCP connection is established between the client and the server.

Now that the TCP connection is established, the first request data is passed from the client to the server
13:20:36.756402 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [P.], seq 1:171, ack 1, win 115, options [nop,nop,TS val 1544091 ecr 977680945], length 170

13:20:36.788779 IP 74.125.236.183.80 > 192.168.1.33.41002: Flags [.], ack 171, win 992, options [nop,nop,TS val 977680978 ecr 1544091], length 0
13:20:36.844585 IP 74.125.236.183.80 > 192.168.1.33.41002: Flags [P.], seq 1:598, ack 171, win 992, options [nop,nop,TS val 977681032 ecr 1544091], length 597
13:20:36.844603 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [.], ack 598, win 124, options [nop,nop,TS val 1544179 ecr 977681032], length 0

Data transfer is complete 

13:20:36.844773 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [F.], seq 171, ack 598, win 124, options [nop,nop,TS val 1544179 ecr 977681032], length 0
13:20:36.875132 IP 74.125.236.183.80 > 192.168.1.33.41002: Flags [F.], seq 598, ack 172, win 992, options [nop,nop,TS val 977681064 ecr 1544179], length 0
13:20:36.875166 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [.], ack 599, win 124, options [nop,nop,TS val 1544209 ecr 977681064], length 0
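For longer captures it can help to pull the interesting fields out of each line programmatically. Below is a rough Python sketch that parses lines in the format shown above. The regular expression and the name parse_line are my own, and the pattern assumes IPv4 TCP lines exactly as printed by tcpdump -nn here; other packet types will simply not match.

```python
import re

# Matches lines such as:
# 13:20:36.724978 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [S], ..., length 0
LINE_RE = re.compile(
    r"^(?P<time>\S+) IP (?P<src>\S+)\.(?P<sport>\d+) > "
    r"(?P<dst>\S+)\.(?P<dport>\d+): Flags \[(?P<flags>[^\]]*)\]"
    r".*length (?P<length>\d+)$"
)

def parse_line(line):
    """Return a dict of fields for one tcpdump line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("sport", "dport", "length"):
        fields[key] = int(fields[key])
    return fields
```

Feeding it the first line of the capture gives the source 192.168.1.33 port 41002, the destination 74.125.236.183 port 80, the flags string S and a length of 0.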


A very interesting read

http://igoro.com/archive/what-really-happens-when-you-navigate-to-a-url/

Saturday, April 20, 2013

List all installed Python modules/packages

To list all the installed Python packages/modules

1) As superuser (root), run the following command


# python -c "help('modules')"

Please wait a moment while I gather a list of all available modules...

BaseHTTPServer      chunk               invest              repr
Bastion             cmath               io                  resource
CDROM               cmd                 iotop               rexec
CGIHTTPServer       code                ipaclient           rfc822
CORBA               codecs              ipalib              rlcompleter
ConfigParser        codeop              ipapython           robotparser
Cookie              collections         itertools           rpm
Crypto              colorsys            iwlib               rpmUtils
DLFCN               commands            ixf86config         runpy
DocXMLRPCServer     compileall          json                scanext
HTMLParser          compiler            kerberos            scdate
IN                  contextlib          keyword             sched
MimeWriter          cookielib           krbV                sckdump
ORBit               copy                ldap                scservices
OpenSSL             copy_reg            ldapurl             select
PortableServer      cracklib            ldif                selinux
Queue               createrepo          lib2to3             sets
SSSDConfig          crypt               libiscsi            setuptools
SimpleHTTPServer    cryptsetup          libproxy            sgmllib
SimpleXMLRPCServer  csv                 libuser             sha
SocketServer        ctypes              libxml2             shelve
StringIO            cups                libxml2mod          shlex
TYPES               cupsext             linecache           shutil
UserDict            cupshelpers         linuxaudiodev       signal
UserList            curl                locale              site
UserString          curses              logging             slip
_LWPCookieJar       datetime            lxml                smbc
_MozillaCookieJar   dbhash              macpath             smtpd
__builtin__         dbm                 macurl2path         smtplib
__future__          dbus                mailbox             snack
_abcoll             dbus_bindings       mailcap             sndhdr
_anthy              decimal             mako                socket
_ast                decorator           markupbase          sos
_bisect             default_encoding_utf8 markupsafe          spwd
_bsddb              deltarpm            marshal             sqlite3
_bytesio            difflib             math                sqlitecachec
_codecs             dircache            md5                 sre
_codecs_cn          dis                 meh                 sre_compile
_codecs_hk          distutils           mhlib               sre_constants
_codecs_iso2022     dl                  mimetools           sre_parse
_codecs_jp          dmidecode           mimetypes           ssl
_codecs_kr          dmidecodemod        mimify              stat
_codecs_tw          doctest             mmap                statvfs
_collections        drv_libxml2         modulefinder        string
_cracklib           dsextras            multifile           stringold
_crypt              dsml                multiprocessing     stringprep
_csv                dumbdbm             mutex               strop
_ctypes             dummy_thread        netaddr             struct
_curses             dummy_threading     netrc               subprocess
_curses_panel       easy_install        new                 sunau
_dbus_bindings      egg                 nis                 sunaudio
_dbus_glib_bindings email               nntplib             symbol
_deltarpm           encodings           nose                symtable
_elementtree        errno               nss                 sys
_fileio             ethtool             ntpath              syslog
_functools          exceptions          nturl2path          system_config_keyboard
_gamin              fcntl               numbers             tabnanny
_hashlib            feedparser          numpy               talloc
_heapq              filecmp             opcode              tarfile
_hotshot            fileinput           operator            telnetlib
_json               firstboot           optparse            tempfile
_ldap               fnmatch             orca                termios
_locale             formatter           os                  test
_lsprof             fpformat            os2emxpath          textwrap
_multibytecodec     fractions           ossaudiodev         this
_multiprocessing    ftplib              packagekit          thread
_ped                functools           pango               threading
_random             future_builtins     pangocairo          time
_snack              gamin               paramiko            timeit
_socket             gc                  parser              timing
_sqlite3            gconf               parted              toaiff
_sqlitecache        gdbm                pcardext            token
_sre                genericpath         pdb                 tokenize
_ssl                getopt              pickle              trace
_strptime           getpass             pickletools         traceback
_struct             gettext             pip                 tty
_symtable           gio                 pipes               types
_threading_local    glib                pkg_resources       unicodedata
_warnings           glob                pkgutil             unittest
_weakref            gmenu               platform            uno
abc                 gnome               plistlib            unohelper
abrt_exception_handler gnomeapplet         popen2              urlgrabber
acutil              gnomecanvas         poplib              urllib
aifc                gnomekeyring        posix               urllib2
anthy               gnomevfs            posixfile           urlparse
anydbm              gobject             posixpath           user
array               gpgme               pprint              uu
ast                 grp                 profile             uuid
asynchat            gst                 pstats              vte
asyncore            gstoption           pty                 warnings
atexit              gtk                 pwd                 wave
atk                 gtksourceview2      py_compile          weakref
audiodev            gtkunixprint        pyatspi             webbrowser
audioop             gzip                pyclbr              webkit
base64              hashlib             pycryptsetup        whichdb
bdb                 heapq               pycurl              wireshark_be
beaker              hmac                pydoc               wireshark_gen
binascii            hotshot             pydoc_topics        wnck
binhex              hpmudext            pyexpat             wsgiref
bisect              htmlentitydefs      pygst               xdg
block               htmllib             pygtk               xdrlib
bonobo              httplib             pyhbac              xf86config
bsddb               ibus                pykickstart         xml
bz2                 idlelib             pynotify            xmllib
cPickle             ihooks              pysss               xmlrpclib
cProfile            imageop             pysss_murmur        xxsubtype
cStringIO           imaplib             quopri              yum
cairo               imghdr              random              yumutils
calendar            imp                 re                  zipfile
cas                 imputil             readline            zipimport
cgi                 iniparse            report              zlib
cgitb               inspect             reportclient

Enter any module name to get more help.  Or, type "modules spam" to search for modules whose descriptions contain the word "spam".
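A similar list can be gathered from inside a script instead of the interactive help. The following is a minimal sketch that assumes Python 3 (the output above comes from a Python 2 installation); the function name available_modules is my own.

```python
import pkgutil
import sys

def available_modules():
    """Return the sorted names of built-in modules plus top-level modules on sys.path."""
    names = set(sys.builtin_module_names)
    # iter_modules() with no arguments walks every entry of sys.path.
    names.update(module.name for module in pkgutil.iter_modules())
    return sorted(names)

print(len(available_modules()), "modules found")
```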

2) Alternatively, the pip-python command can also be used to list the installed Python packages/modules


$ pip-python freeze
Beaker==1.3.1
Mako==0.3.4
MarkupSafe==0.9.2
SSSDConfig==1.9.2
cas==0.15
cups==1.0
cupshelpers==1.0
decorator==3.0.1
distribute==0.6.10
ethtool==0.6
feedparser==5.0.1
firstboot==1.110
freeipa==2.0.0.alpha.0
iniparse==0.3.1
iotop==0.3.2
ipapython==3.0.0
iwlib==1.0
kerberos==1.0
lxml==2.2.3
netaddr==0.7.5
nose==0.10.4
numpy==1.4.1
paramiko==1.7.5
pyOpenSSL==0.10
pycrypto==2.0.1
pycryptsetup==0.0.11
pycurl==7.19.0
pygpgme==0.1
pykickstart==1.74.12
python-default-encoding==0.1
python-dmidecode==3.10.13
python-ldap==2.3.10
python-meh==0.11
python-nss==0.13
pyxdg==0.18
scdate==1.9.60
sckdump==2.0.5
scservices==0.99.45
scservices.dbus==0.99.45
slip==0.2.20
slip.dbus==0.2.20
slip.gtk==0.2.20
smbc==1.0
urlgrabber==3.9.1
yum-metadata-parser==1.1.2
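The same name==version listing can be approximated from Python itself. This sketch assumes Python 3.8 or newer, where importlib.metadata is part of the standard library; the function name freeze is only chosen to mirror the pip subcommand and is not pip's own code.

```python
from importlib import metadata

def freeze():
    """Return 'name==version' lines for the distributions Python can see."""
    lines = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        if meta is None:
            continue  # skip installations with missing metadata
        name, version = meta.get("Name"), meta.get("Version")
        if name and version:
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)

for line in freeze():
    print(line)
```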

Install python libraries/packages/modules in CentOS

To install Python libraries/packages/modules in CentOS, there are two ways to do it

1) Using yum
2) Using pip, the Python package installer, which by default fetches packages from the Python Package Index (PyPI). On CentOS, the python-pip package provides this tool under the name pip-python

To install a python library/package/module using yum

For this purpose, the EPEL (Extra Packages for Enterprise Linux) repo for yum needs to be installed first.

1) Find the version of Linux


# lsb_release -a
LSB Version:    :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
Distributor ID: CentOS
Description:    CentOS release 6.4 (Final)
Release:        6.4
Codename:       Final

2) Download and install the epel repo rpm relevant for the above version

a) # wget http://mirror-fpt-telecom.fpt.net/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm

b) # yum install epel-release-6-8.noarch.rpm

Once the EPEL yum repo is installed, any Python package/module/library can be installed using yum like any other Linux package

Say, for example, if I want to install the feedparser Python library/package/module, it can be done as follows

# yum install python-feedparser

To install a python library/package/module using pip

1) pip is itself a Python package, so it needs to be installed first. To install it, the EPEL repo on CentOS needs to be enabled (how to enable the EPEL repo is described in the section above).

2) Once EPEL repo is enabled, install pip as follows

      # yum install python-pip

This provides us a tool called pip-python, using which, the following can be done

a) Install a package

      # pip-python install packagename

b) Uninstall a package

      # pip-python uninstall packagename

c) To list all installed python packages/modules/libraries

      # pip-python freeze

d) To get help with pip-python command

       #  pip-python help

Flags in TCP Header


The 20-byte TCP header has six classic individual bit flags - URG, ACK, PSH, SYN, FIN, RST. tcpdump additionally prints a placeholder character, described last in the list below.
While using tcpdump, these are represented as urg, ack, P, S, F, R, .

URG - urg - Indicates that the urgent pointer portion of the header should be examined. Urgent data should take precedence over other data. For example, pressing Ctrl-C to terminate a download.
ACK - ack - Indicates that the acknowledgement number should be examined. An ACK packet is used to acknowledge the receipt of data. This flag may appear in conjunction with other flags.
PSH - P - Indicates that the receiver should hand this data up to the next layer as soon as possible. Signals the immediate push of data from the sending host to the receiving host.
SYN - S - Initiates a connection. SYN packet, a session establishment request. First part of any TCP connection.
FIN - F - Indicates that the sender (either client or server) is done sending data. It indicates the intention to terminate the existing connection to the other end.
RST - R - Indicates that the connection should be reset. It indicates the sender's intention to immediately abort the existing connection.
Placeholder - . - Not a header flag. Older tcpdump versions print it after the destination port when none of the SYN, FIN, RST or PSH flags is set, and it also appears in conjunction with the ack flag. In the newer Flags [...] output format used in the sample below, the dot itself stands for the ACK flag, so [S.] means SYN plus ACK and [.] means ACK alone.
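To make the mapping concrete, here is a small Python sketch that decodes the flag bits of a TCP header byte. The bit values are the standard ones from the TCP header layout; the function names are my own, and the characters follow the Flags [...] notation of recent tcpdump versions, where the dot marks ACK.

```python
# (bit mask, flag name, character used inside tcpdump's "Flags [...]")
TCP_FLAGS = [
    (0x01, "FIN", "F"),
    (0x02, "SYN", "S"),
    (0x04, "RST", "R"),
    (0x08, "PSH", "P"),
    (0x10, "ACK", "."),
    (0x20, "URG", "U"),
]

def flag_names(flag_byte):
    """Return the names of the flags set in the TCP flags byte."""
    return [name for bit, name, _ in TCP_FLAGS if flag_byte & bit]

def tcpdump_notation(flag_byte):
    """Return the flags the way they appear between the square brackets."""
    return "".join(char for bit, _, char in TCP_FLAGS if flag_byte & bit) or "none"

print(flag_names(0x12), tcpdump_notation(0x12))
```

For example, the value 0x12 has the SYN and ACK bits set, which is the second packet of the handshake.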

A sample tcpdump report, where we are accessing the host 74.125.236.183 on its port 80 using the curl command

For the following command

# curl -IL 74.125.236.183

the tcpdump report generated using the command

# tcpdump -w test.pcap -i bond0 host 74.125.236.183

# tcpdump -nnr test.pcap
13:20:36.724978 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [S], seq 3247485552, win 14600, options [mss 1460,sackOK,TS val 1544059 ecr 0,nop,wscale 7], length 0
13:20:36.756286 IP 74.125.236.183.80 > 192.168.1.33.41002: Flags [S.], seq 3589360082, ack 3247485553, win 62392, options [mss 1430,sackOK,TS val 977680945 ecr 1544059,nop,wscale 6], length 0
13:20:36.756326 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [.], ack 1, win 115, options [nop,nop,TS val 1544091 ecr 977680945], length 0
13:20:36.756402 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [P.], seq 1:171, ack 1, win 115, options [nop,nop,TS val 1544091 ecr 977680945], length 170
13:20:36.788779 IP 74.125.236.183.80 > 192.168.1.33.41002: Flags [.], ack 171, win 992, options [nop,nop,TS val 977680978 ecr 1544091], length 0
13:20:36.844585 IP 74.125.236.183.80 > 192.168.1.33.41002: Flags [P.], seq 1:598, ack 171, win 992, options [nop,nop,TS val 977681032 ecr 1544091], length 597
13:20:36.844603 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [.], ack 598, win 124, options [nop,nop,TS val 1544179 ecr 977681032], length 0
13:20:36.844773 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [F.], seq 171, ack 598, win 124, options [nop,nop,TS val 1544179 ecr 977681032], length 0
13:20:36.875132 IP 74.125.236.183.80 > 192.168.1.33.41002: Flags [F.], seq 598, ack 172, win 992, options [nop,nop,TS val 977681064 ecr 1544179], length 0
13:20:36.875166 IP 192.168.1.33.41002 > 74.125.236.183.80: Flags [.], ack 599, win 124, options [nop,nop,TS val 1544209 ecr 977681064], length 0

How is a TCP connection closed?


Closing a TCP connection is a four-step process, as opposed to the three-step process for connection establishment. The extra step is due to the full-duplex nature of the TCP connection, where both client and server may be sending data at any given time.

  1. TCP connection closing is initiated by either the client or the server, by sending a TCP segment with the FIN flag set to the other end, indicating that it wishes to close the connection. So if the client sends a TCP segment with the FIN flag set to the server, the client enters the FIN_WAIT_1 state and the server, on receiving it, enters the CLOSE_WAIT state.
  2. After the FIN is received by the server, the server sends a TCP segment with the ACK flag set to the client, with the acknowledgment number set to the client's sequence number plus one. On receiving it, the client goes into the FIN_WAIT_2 state. The server also indicates to its own higher-layer protocols that the client has finished sending.
  3. The server closes its side of the connection by sending a TCP segment with the FIN flag set to the client. This causes the server to go into the LAST_ACK state, while the client, on receiving the FIN, goes into the TIME_WAIT state.
  4. Finally, the client acknowledges the FIN sent by the server with an ACK whose acknowledgment number is the server's sequence number plus one. On receiving it, the server goes into the CLOSED state. The client stays in TIME_WAIT for a while before it too reaches CLOSED.
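The state changes in the four steps above can be written down as a small transition table. This Python sketch only models the states named in this post, with the client as the side that closes first; the event names and the run helper are made up for illustration.

```python
# (current state, event) -> next state, for an orderly close
TRANSITIONS = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("ESTABLISHED", "recv FIN"): "CLOSE_WAIT",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("CLOSE_WAIT", "send FIN"): "LAST_ACK",
    ("FIN_WAIT_2", "recv FIN"): "TIME_WAIT",
    ("LAST_ACK", "recv ACK"): "CLOSED",
    ("TIME_WAIT", "timeout"): "CLOSED",
}

def run(events, state="ESTABLISHED"):
    """Apply the events in order and return every state visited."""
    states = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        states.append(state)
    return states

# The side that closes first (the client here) and the side that closes second.
ACTIVE_CLOSER = ["send FIN", "recv ACK", "recv FIN", "timeout"]
PASSIVE_CLOSER = ["recv FIN", "send FIN", "recv ACK"]

print("client:", " -> ".join(run(ACTIVE_CLOSER)))
print("server:", " -> ".join(run(PASSIVE_CLOSER)))
```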

Because TCP connections can be closed by either side, a TCP connection can exist in half-closed mode in which one end has initiated the FIN sequence but the other end has not done so.

RST - TCP connections can also be terminated by one end sending a TCP segment with the RST (reset) flag set. This tells the other side to use an abortive release. This is in contrast to the normal termination of a TCP connection, sometimes referred to as an orderly release.

Why is TCP called a connection-oriented protocol? TCP three-way handshake


Before exchanging data, a connection must be established between client and server using a three-way handshake. Hence TCP is called a connection-oriented protocol.

  1. When initiating a connection with the server, the client sends a TCP segment with the SYN flag set, carrying an Initial Sequence Number (ISN) and the port number of the server. The client is said to be in the SYN_SENT state.
  2. At this point, the server makes an entry for the connection in the listen queue and replies with a TCP segment that has both the SYN and ACK flags set, which indicates that the server acknowledges receiving the client's initial packet and wishes to establish a connection with the client. The server chooses its own ISN and sets the acknowledgment number to one higher than the ISN sent by the client. This is referred to as the SYN-ACK packet or SYN-ACK segment. The server is said to be in the SYN_RCVD state.
  3. The client then acknowledges the SYN-ACK packet by sending a TCP segment with the ACK flag set and the acknowledgment number set to the server's ISN plus one. This completes the three-way handshake and the connection is said to be in the ESTABLISHED state. Once the client responds, the connection moves from the listen queue into the connection queue.

For a system on a high-latency network that receives a large number of connection requests, there is a possibility of the listen queue becoming full because of the time required for the clients to complete the connection.

With TCP, both client and server can send data at the same time, making TCP a full duplex protocol.
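The handshake itself is carried out by the operating system, so application code only sees listen(), connect() and accept(). Here is a minimal Python sketch over the loopback interface; the function name, the backlog of 5 and the payload are arbitrary choices. On common systems connect() can return before accept() is called, because the kernel completes the handshake and parks the connection in a queue until the server accepts it.

```python
import socket

def loopback_demo(payload=b"hello"):
    """Open a TCP connection to ourselves and send one small message over it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(5)               # queue up to 5 pending connections
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(5)
    try:
        client.connect(listener.getsockname())  # SYN, SYN/ACK, ACK happen here
        conn, peer = listener.accept()          # hands over the established connection
        try:
            conn.settimeout(5)
            client.sendall(payload)
            received = conn.recv(1024)
            same_endpoint = peer == client.getsockname()
        finally:
            conn.close()
    finally:
        client.close()
        listener.close()
    return same_endpoint, received

print(loopback_demo())
```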

The TCP three-way handshake can be summarized as follows:

1) Client end
    Client sends SYN

2) Server end
    Kernel hands SYN to Server
    Server calls accept()
    Kernel sends SYN/ACK to the client

3) Client end
   Client sends ACK

TCP Connection is ESTABLISHED.

Now, on the server end, the kernel passes the connection to the server's accept() call.


syslog, rsyslog - Centralized Logging


Logs generated by server daemons such as Apache, or by applications, are essentially streams and not files, even though the application or daemon usually has a configuration parameter for logging to a file.

Why are logs considered streams? Because logs have no beginning or end; they are simply ongoing.

How does Linux handle logs (streams)?

In Linux, stdout and stderr are the two default output streams, automatically available to all programs. These streams can be written to files using a redirect operator.

An application which uses stdout for logging can be made to log to a file as follows
$ application >> /var/log/applogfile

Syslog:

Log messages from an application can be directed to files (using the redirection operator), to user terminals, or through other programs (with a pipe: to email, a pager, or a log file analyzer). That raises the question of how to do distributed logging, that is, logging to a centralized place from multiple hosts. The syslog protocol comes to the aid here. Today there are also more modern logging systems, such as Scribe and Splunk.

As per RFC-3164:
In its most simplistic terms, the syslog protocol provides a transport to allow a machine to send event notification messages across IP networks to event message collectors - also known as syslog servers.  Since each process, application and operating system was written somewhat independently, there is little uniformity to the content of syslog messages.  For this reason, no assumption is made upon the formatting or contents of the messages.  The protocol is simply designed to transport these event messages.  In all cases, there is one device that originates the message.  The syslog process on that machine may send the message to a collector.  No acknowledgement of the receipt is made.

The syslog protocol helps to send logs from many components to a single location. Programs that wish to use the syslog protocol for logging should have syslog awareness implemented in them. However, a program that uses stdout for logging can use syslog without implementing any syslog awareness, by piping its output to the standard logger command.

$ myapplication | logger

We can also split the log stream so that it goes to a local file as well as to syslog, as follows

$  myapplication | tee /var/log/myapplication.log | logger
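For a program written in Python, the same split can be done in code. The sketch below is a rough equivalent of the tee and logger pipeline: the name tee_lines and its sink parameter are my own, and when no sink is given it falls back to syslog.syslog from the standard library, which is available on Unix systems only.

```python
def tee_lines(stream, logfile, sink=None):
    """Copy each line of stream to logfile and also hand it to sink."""
    if sink is None:
        import syslog  # Unix only; sends to the local syslog daemon
        sink = syslog.syslog
    count = 0
    for line in stream:
        logfile.write(line)        # the "tee" half: keep a local copy
        sink(line.rstrip("\n"))    # the "logger" half: forward the message
        count += 1
    logfile.flush()
    return count
```

It could be called as tee_lines(sys.stdin, open("/var/log/myapplication.log", "a")) in a script placed at the end of the pipe.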

When set up to use a syslog server, devices will send their log messages over the network wire to the syslog server rather than recording them in a local file or displaying them.

Syslog is three things:

1) /dev/log, a UNIX-domain socket. Applications can connect to it and send it "messages".
2) UDP port 514, which is another service upon which applications can send messages. More importantly, this allows messages to be transferred from one host to another.
3) A program — sysklogd, syslog-ng, rsyslog, or one of several other variants of syslog — that listens on these sockets and ports, reads the messages, and decides where the messages should be sent, usually one or more files.

On my system, the rsyslog program is used, and it listens on a Unix domain socket

[root@dhcppc0 ~]# netstat -anp | grep rsyslog
unix  22     [ ]         DGRAM                    10559  1331/rsyslogd       /dev/log

rsyslog is an enhanced, multi-threaded syslog.

From here onwards, I shall be using rsyslog and syslog interchangeably.

What gets logged by rsyslogd daemon and where it goes is controlled by /etc/rsyslog.conf. A modern system uses rsyslog to centralize logging.

Here's how it works: A developer uses the syslog API function (or the logger program in shell scripts) to send log messages to rsyslogd. The information passed to rsyslogd includes the source of the log message (called a facility) and the priority of the log message.

rsyslogd then matches the facility and priority against selectors (combinations of facilities and priorities) in its configuration file. For each selector that matches, the message is sent to the corresponding destination(s).

Most log files go under /var/log directory. Besides the more specific log files, there is a general system log file usually called messages. Other important log files to monitor include: boot.log, dmesg (also the dmesg command), maillog, secure, wtmp (examine with the last command).

Log files contain sensitive information! You must protect these files by setting permissions carefully!

Log messages don't only have to go to files, you can direct them to user terminals, run them through other programs (with a pipe, to email, pager, or just a log file analyzer), or send them to another host running syslogd.

The following severity levels can be assigned to messages

0 - Emergency (emerg)
1 - Alerts (alert)
2 - Critical (crit)
3 - Errors (err)
4 - Warnings (warn)
5 - Notification (notice)
6 - Information (info)
7 - Debug (debug)

The messages can be categorized as follows. These categories are called "facilities"

auth             The authorization system. Ex.: login, su, ftpd, rshd
authpriv         User access messages use this
cron             Used by the cron facility
daemon           Other daemon programs without a facility of their own
ftp              Used by ftp applications
kern             Kernel messages
lpr              The line printer spooling system
mail             Used by mail applications
mark             Used by syslogd to produce timestamps in log files
news             Used by news applications
security         Same as auth. Should not be used anymore.
syslog           Messages from the syslog process itself
user             Messages generated by random user processes. Default.
uucp             UUCP messages
local0 – local7  Reserved for local use.
*                For all
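On the wire, facility and severity are combined into a single priority number (PRI), which RFC 3164 defines as the facility code multiplied by 8 plus the severity. Here is a small Python sketch; the facility codes are the ones listed in the RFC, while the dictionary and function names are my own.

```python
# Numeric facility codes as listed in RFC 3164
FACILITIES = {
    "kern": 0, "user": 1, "mail": 2, "daemon": 3, "auth": 4, "syslog": 5,
    "lpr": 6, "news": 7, "uucp": 8, "cron": 9, "authpriv": 10, "ftp": 11,
}
FACILITIES.update({f"local{i}": 16 + i for i in range(8)})

SEVERITIES = {
    "emerg": 0, "alert": 1, "crit": 2, "err": 3,
    "warn": 4, "notice": 5, "info": 6, "debug": 7,
}

def pri(facility, severity):
    """Return the PRI value that prefixes a syslog message, e.g. <13>."""
    return FACILITIES[facility] * 8 + SEVERITIES[severity]

def split_pri(value):
    """Return (facility code, severity code) for a PRI value."""
    return divmod(value, 8)

print(pri("user", "notice"), split_pri(165))
```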

In /etc/rsyslog.conf, we can specify

1)  The severity levels to log for each facility.
2)  Where the messages go (files, users, pipes)

The entry format is as follows

       facility.severity         log-file-name

Sample entries in the /etc/rsyslog.conf file look as follows

1) Do NOT send messages from the mail, authentication and cron facilities to /var/log/messages; note the keyword none

       *.info;mail.none;authpriv.none;cron.none                /var/log/messages

2) The authpriv file has restricted access.

                  authpriv.*                                              /var/log/secure

3) Log all the mail messages in one place.

                    mail.*                                                  -/var/log/maillog

4)  Log cron stuff


                  cron.*                                                  /var/log/cron

5) To redirect all incoming messages from all facilities and with all severities to /var/log/syslog

                     *.*            -/var/log/syslog

6) To select messages with severity critical or higher and save them to the file /var/log/critical

                       *.crit           -/var/log/critical
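To see how such selectors are evaluated, here is a simplified matcher in Python. It is only a sketch of the traditional facility.severity syntax under a few assumptions: a severity matches that level and anything more severe, none excludes a facility, later parts of a selector override earlier ones, and the = and ! modifiers are not handled. The function name is made up and this is not rsyslog's own code.

```python
SEVERITIES = ["emerg", "alert", "crit", "err", "warn", "notice", "info", "debug"]

def selector_matches(selector, facility, severity):
    """Return True if a message with this facility and severity is selected."""
    level = SEVERITIES.index(severity)  # a lower index means more severe
    matched = False
    for part in selector.split(";"):
        facilities, _, priority = part.strip().rpartition(".")
        names = facilities.split(",")
        if "*" not in names and facility not in names:
            continue  # this part says nothing about our facility
        if priority == "none":
            matched = False
        elif priority == "*":
            matched = True
        else:
            matched = level <= SEVERITIES.index(priority)
    return matched

rule = "*.info;mail.none;authpriv.none;cron.none"
print(selector_matches(rule, "daemon", "notice"), selector_matches(rule, "mail", "err"))
```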

What's that dash in front of the filenames in /etc/rsyslog.conf file?

This is to avoid the rsyslogd or syslogd daemon becoming a performance bottleneck for the system. The daemon uses fsync() to flush every file write. This is done to reduce the chances of log information getting lost before being written to the disk in case of a crash. By prefixing the names of the log files with a dash/hyphen, we specify that log file writes should not be flushed to disk immediately.
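The trade-off is easy to reproduce in a few lines of Python. In this sketch the function name and the sync parameter are illustrative only: sync=True corresponds to a log file written without the dash, and sync=False to one written with it.

```python
import os

def append_log_line(path, line, sync=True):
    """Append one line to a log file, optionally forcing it onto the disk."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(line + "\n")
        if sync:
            handle.flush()             # push Python's buffer to the kernel
            os.fsync(handle.fileno())  # ask the kernel to write it to disk now
```

Calling fsync() after every message is the safe but slow path. Skipping it leaves the operating system free to batch writes, at the risk of losing the last few messages in a crash.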

How to configure rsyslog for centralized logging?


Central Log Host

1) In the central log host, first setup "rsyslogd" to accept remote messages. This shall be done by uncommenting the following lines under "Modules" section in  "/etc/rsyslog.conf" file in central log host as follows
          # Provides UDP syslog reception
          $ModLoad imudp
          $UDPServerRun 514

This opens up UDP port 514 for receiving messages from remote servers.
UDP/TCP reception is not enabled by default in rsyslog, so we need to load the module imudp (for UDP) or imtcp (for TCP) to enable it.

2) Restart rsyslog on the central log host
           service rsyslog restart
   
   Verify that rsyslog is listening on UDP port 514

   [root@dhcppc0 etc]# netstat -tulnp | grep 514
   udp        0      0 0.0.0.0:514                 0.0.0.0:*                               3826/rsyslogd
   udp        0      0 :::514                      :::*                                    3826/rsyslogd

Now the central machine will accept log messages from other machines on UDP port 514

Remote Host

Now the client machines need to be configured to send their log messages to the central log host listening on UDP port 514.

1) On the client machines, edit the file /etc/rsyslog.conf and add the following line
        user.*     @central_host_ip:514

2) Then restart the rsyslog service on the client machines
         service rsyslog restart

3) Test the setup by running the logger command on a client machine

       logger -i -t <yourname> "This is a test from client"
       E.g. logger -i -t root "This is a test from client"

On the central log host, check /var/log/messages for the entry "This is a test from client"
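The same kind of test message can also be sent straight from Python. The following is a minimal Python 3 sketch, assuming the central log host accepts UDP syslog on port 514 as configured above. It uses SysLogHandler from the standard logging.handlers module; the function name send_test_message and the logger name are hypothetical. Note that this talks to the central host directly and does not go through the local rsyslog forwarding rule.

```python
import logging
import logging.handlers


def send_test_message(host, port=514, text="This is a test from client"):
    """Send one message with facility 'user' to a syslog server over UDP."""
    handler = logging.handlers.SysLogHandler(
        address=(host, port),  # a (host, port) tuple means UDP by default
        facility=logging.handlers.SysLogHandler.LOG_USER,
    )
    logger = logging.getLogger("rsyslog-forward-test")  # hypothetical name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(handler)
    try:
        logger.info(text)
    finally:
        # Detach and close so that repeated calls do not pile up handlers.
        logger.removeHandler(handler)
        handler.close()
```

Calling send_test_message("central_host_ip") with the real address of the central log host should produce an entry there, provided the server has a rule that stores user-facility messages.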

Get HTTP status code using curl from command line

Suppose I want to test the availability of a website from the command line; curl comes to the aid.

The following curl options are very useful


 -s/--silent
              Silent or quiet mode. Don't show progress meter or error messages.  Makes Curl mute.


-I/--head
              (HTTP/FTP/FILE)  Fetch  the  HTTP-header only! HTTP-servers feature the command HEAD which this uses to get nothing but the header of a document. When used on a FTP or FILE file, curl displays the file size and last modification time only.

 -L/--location
              (HTTP/HTTPS)  If  the server reports that the requested page has moved to a different location (indicated with a Location: header and a 3XX response code), this option will make curl redo the request on the new place. If used  together  with  -i/--include  or  -I/--head, headers  from all requested pages will be shown.


Let us illustrate with an example


$ curl -I google.com

HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
Content-Type: text/html; charset=UTF-8
Date: Sat, 20 Apr 2013 14:14:56 GMT
Expires: Mon, 20 May 2013 14:14:56 GMT
Cache-Control: public, max-age=2592000
Server: gws
Content-Length: 219
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN

So here the HTTP status code returned for the URL google.com is 301, indicating a redirect. The Location header names the redirect target, but curl does not follow it by default, so we do not see the response from the final URL. To follow the redirects, we use the -L option of curl as follows.

$ curl -IL google.com


HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
Content-Type: text/html; charset=UTF-8
Date: Sat, 20 Apr 2013 14:15:00 GMT
Expires: Mon, 20 May 2013 14:15:00 GMT
Cache-Control: public, max-age=2592000
Server: gws
Content-Length: 219
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN

HTTP/1.1 302 Found
Location: http://www.google.co.in/
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Set-Cookie: PREF=ID=34355ea70d42cfd3:FF=0:TM=1366467300:LM=1366467300:S=lf21z4mtM-zoJzkp; expires=Mon, 20-Apr-2015 14:15:00 GMT; path=/; domain=.google.com
Set-Cookie: NID=67=U_DZQR302SSX-7TZo3M6w0aaSBgj6l32BvBjrRO1i4Sk8Ecy6YzKDK5HBewGsgf5bB4sQI_PVRzCeeYfkUlT10X57aqV7jGBFGUx9JcAMZ0rjbIFNggpQULTqCAjil_n; expires=Sun, 20-Oct-2013 14:15:00 GMT; path=/; domain=.google.com; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Date: Sat, 20 Apr 2013 14:15:00 GMT
Server: gws
Content-Length: 221
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN

HTTP/1.1 200 OK
Date: Sat, 20 Apr 2013 14:15:01 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=61af3f5ebcc415a0:FF=0:TM=1366467301:LM=1366467301:S=76cA-ULyN5yq8JmR; expires=Mon, 20-Apr-2015 14:15:01 GMT; path=/; domain=.google.co.in
Set-Cookie: NID=67=uTayihoLkuwaP5vaOmQHcWVs9jJVOdKg1JucLn7Vcybi_t0_b25EZw6qAvPNYRQLkTX0Y56P5YSUWjEyd6EmcZWkPTBhxSs7jrdf0ndfpCpYcrLspEVt9SG2WCRyxzCh; expires=Sun, 20-Oct-2013 14:15:01 GMT; path=/; domain=.google.co.in; HttpOnly
P3P: CP="This is not a P3P policy! See http://www.google.com/support/accounts/bin/answer.py?hl=en&answer=151657 for more info."
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Transfer-Encoding: chunked
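When the header dump gets this long, it helps to reduce it to the chain of status codes and redirect targets. Here is a minimal Python 3 sketch that does so for text saved from curl -IL. The function name redirect_chain is my own, and the sample string is an abbreviated copy of the output above.

```python
def redirect_chain(header_dump):
    """Return a list of (status_code, location) pairs, one per response."""
    chain = []
    for line in header_dump.splitlines():
        line = line.strip()
        if line.startswith("HTTP/"):
            # Status line, e.g. "HTTP/1.1 301 Moved Permanently"
            chain.append([int(line.split()[1]), None])
        elif line.lower().startswith("location:") and chain:
            # Attach the redirect target to the most recent response.
            chain[-1][1] = line.split(":", 1)[1].strip()
    return [tuple(hop) for hop in chain]


sample = """HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/

HTTP/1.1 302 Found
Location: http://www.google.co.in/

HTTP/1.1 200 OK
Content-Type: text/html; charset=ISO-8859-1
"""

for code, location in redirect_chain(sample):
    print(code, location)
```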


To get just the status code and the final URL, without any header information

curl -sL -w "%{http_code} %{url_effective}\\n" "URL" -o /dev/null

$ curl -sL -w "%{http_code} %{url_effective}\\n" "http://here.com" -o /dev/null
200 http://here.com
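For use inside a script, roughly the same check can be done with the Python 3 standard library. This is a minimal sketch, not a replacement for curl: urllib.request follows redirects on its own, so the function returns the status code of the final response together with the final URL. The function name status_and_final_url and the 10 second timeout are assumptions of mine.

```python
import urllib.error
import urllib.request


def status_and_final_url(url, timeout=10):
    """Return (status_code, final_url) for url, following redirects."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.geturl()
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions but still carry a code.
        return err.code, err.url
```

For example, status_and_final_url("http://here.com") returns a tuple in the same spirit as the curl output above. Connection failures such as DNS errors or timeouts are not handled here and surface as urllib.error.URLError.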

Saturday, April 13, 2013

Python : Show method(s) available for an object/module

In Python, I open a file for reading as follows

f = open("/var/log/messages")

where f is a file object.

To perform different file operations, I would like to know the methods available for the file object f.

dir(object) comes to the aid here.

So, to list the methods available for the file object f

#!/usr/bin/python

f = open("/var/log/messages")
print dir(f)

The output of the above script is as follows

['__class__', '__delattr__', '__doc__', '__enter__', '__exit__', '__format__', '__getattribute__', '__hash__', '__init__', '__iter__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'close', 'closed', 'encoding', 'errors', 'fileno', 'flush', 'isatty', 'mode', 'name', 'newlines', 'next', 'read', 'readinto', 'readline', 'readlines', 'seek', 'softspace', 'tell', 'truncate', 'write', 'writelines', 'xreadlines']

The above script can be modified as follows to display the names returned in the list, one per line



#!/usr/bin/python

f = open("/var/log/messages")


for method in dir(f):
    if hasattr(f, method):
        print method

Output

__class__
__delattr__
__doc__
__enter__
__exit__
__format__
__getattribute__
__hash__
__init__
__iter__
__new__
__reduce__
__reduce_ex__
__repr__
__setattr__
__sizeof__
__str__
__subclasshook__
close
closed
encoding
errors
fileno
flush
isatty
mode
name
newlines
next
read
readinto
readline
readlines
seek
softspace
tell
truncate
write
writelines
xreadlines
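Note that dir() returns every attribute name, so the list mixes methods with plain data attributes such as closed or mode. A minimal sketch of one way to keep only the public, callable names is given below. It is written for Python 3, uses an in-memory io.StringIO object as a stand-in for a real file, and the helper name public_methods is my own.

```python
import io


def public_methods(obj):
    """Names from dir(obj) that are callable and do not start with '_'."""
    return [name for name in dir(obj)
            if not name.startswith("_") and callable(getattr(obj, name))]


f = io.StringIO("sample text")  # in-memory stand-in for an open file

for name in public_methods(f):
    print(name)
```

The callable() check is what drops data attributes like closed, and the underscore test hides the special methods.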

To know the methods available for a module, dir() again comes to the aid


import moduleName
dir(moduleName)

Let us illustrate with an example. Suppose I want to find the methods available for module os

#!/usr/bin/python
import os

print dir(os)

The output of the above script is as follows

['EX_CANTCREAT', 'EX_CONFIG', 'EX_DATAERR', 'EX_IOERR', 'EX_NOHOST', 'EX_NOINPUT', 'EX_NOPERM', 'EX_NOUSER', 'EX_OK', 'EX_OSERR', 'EX_OSFILE', 'EX_PROTOCOL', 'EX_SOFTWARE', 'EX_TEMPFAIL', 'EX_UNAVAILABLE', 'EX_USAGE', 'F_OK', 'NGROUPS_MAX', 'O_APPEND', 'O_ASYNC', 'O_CREAT', 'O_DIRECT', 'O_DIRECTORY', 'O_DSYNC', 'O_EXCL', 'O_LARGEFILE', 'O_NDELAY', 'O_NOATIME', 'O_NOCTTY', 'O_NOFOLLOW', 'O_NONBLOCK', 'O_RDONLY', 'O_RDWR', 'O_RSYNC', 'O_SYNC', 'O_TRUNC', 'O_WRONLY', 'P_NOWAIT', 'P_NOWAITO', 'P_WAIT', 'R_OK', 'SEEK_CUR', 'SEEK_END', 'SEEK_SET', 'ST_APPEND', 'ST_MANDLOCK', 'ST_NOATIME', 'ST_NODEV', 'ST_NODIRATIME', 'ST_NOEXEC', 'ST_NOSUID', 'ST_RDONLY', 'ST_RELATIME', 'ST_SYNCHRONOUS', 'ST_WRITE', 'TMP_MAX', 'UserDict', 'WCONTINUED', 'WCOREDUMP', 'WEXITSTATUS', 'WIFCONTINUED', 'WIFEXITED', 'WIFSIGNALED', 'WIFSTOPPED', 'WNOHANG', 'WSTOPSIG', 'WTERMSIG', 'WUNTRACED', 'W_OK', 'X_OK', '_Environ', '__all__', '__builtins__', '__doc__', '__file__', '__name__', '__package__', '_copy_reg', '_execvpe', '_exists', '_exit', '_get_exports_list', '_make_stat_result', '_make_statvfs_result', '_pickle_stat_result', '_pickle_statvfs_result', '_spawnvef', 'abort', 'access', 'altsep', 'chdir', 'chmod', 'chown', 'chroot', 'close', 'closerange', 'confstr', 'confstr_names', 'ctermid', 'curdir', 'defpath', 'devnull', 'dup', 'dup2', 'environ', 'errno', 'error', 'execl', 'execle', 'execlp', 'execlpe', 'execv', 'execve', 'execvp', 'execvpe', 'extsep', 'fchdir', 'fchmod', 'fchown', 'fdatasync', 'fdopen', 'fork', 'forkpty', 'fpathconf', 'fstat', 'fstatvfs', 'fsync', 'ftruncate', 'getcwd', 'getcwdu', 'getegid', 'getenv', 'geteuid', 'getgid', 'getgroups', 'getloadavg', 'getlogin', 'getpgid', 'getpgrp', 'getpid', 'getppid', 'getsid', 'getuid', 'isatty', 'kill', 'killpg', 'lchown', 'linesep', 'link', 'listdir', 'lseek', 'lstat', 'major', 'makedev', 'makedirs', 'minor', 'mkdir', 'mkfifo', 'mknod', 'name', 'nice', 'open', 'openpty', 'pardir', 'path', 'pathconf', 'pathconf_names', 'pathsep', 'pipe', 
'popen', 'popen2', 'popen3', 'popen4', 'putenv', 'read', 'readlink', 'remove', 'removedirs', 'rename', 'renames', 'rmdir', 'sep', 'setegid', 'seteuid', 'setgid', 'setgroups', 'setpgid', 'setpgrp', 'setregid', 'setreuid', 'setsid', 'setuid', 'spawnl', 'spawnle', 'spawnlp', 'spawnlpe', 'spawnv', 'spawnve', 'spawnvp', 'spawnvpe', 'stat', 'stat_float_times', 'stat_result', 'statvfs', 'statvfs_result', 'strerror', 'symlink', 'sys', 'sysconf', 'sysconf_names', 'system', 'tcgetpgrp', 'tcsetpgrp', 'tempnam', 'times', 'tmpfile', 'tmpnam', 'ttyname', 'umask', 'uname', 'unlink', 'unsetenv', 'urandom', 'utime', 'wait', 'wait3', 'wait4', 'waitpid', 'walk', 'write']
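The list for a module is long because it mixes functions with constants and submodules. As a hedged sketch in Python 3, the standard inspect module can narrow it down to functions only. The helper name module_routines is hypothetical.

```python
import inspect
import os


def module_routines(module):
    """Public names in a module that are functions or built-in functions."""
    return [name
            for name, _ in inspect.getmembers(module, inspect.isroutine)
            if not name.startswith("_")]


for name in module_routines(os):
    print(name)
```

Constants such as sep or O_RDONLY and submodules such as path are left out, since inspect.isroutine() is false for them.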


ACL : How to enable read permission for /var/log/messages for ordinary user in Linux?


By default, the /var/log/messages file can be accessed only by the superuser (root). An ordinary user does not even have read permission for this file.

# ls -l /var/log/messages
-rw------- 1 root root 658711 Apr 14 05:52 /var/log/messages

So how do we enable read permission on /var/log/messages for an ordinary user, say xyz?

Access Control Lists (ACLs) come to the aid by allowing us to grant different levels of access to files and directories.

How to enable ACLs for a Linux filesystem?


1) First install the package acl, which provides the Access Control List command line utilities.
   # yum install acl

2) Mount the partition with the acl option enabled. Edit /etc/fstab as follows

UUID=fffff7aa-57b8-40aa-baa4-588c4eff7651   /  ext4    defaults,acl        1 1

3) Reboot the system (or remount the partition) for the mount options to take effect.

Enable read access for user xyz for the file /var/log/messages


1) setfacl - Sets file access control list.
    
    # setfacl -m u:xyz:r /var/log/messages

2) Check the new file permissions for /var/log/messages
    
     # ls -l /var/log/messages
    -rw-r-----+ 1 root root 658711 Apr 14 05:52 /var/log/messages

Observe that a + now appears at the end of the file permissions, indicating that the file has an ACL.

3) Verify the access permissions for the file /var/log/messages using getfacl command

     getfacl - Get file access control list

  # getfacl /var/log/messages
  getfacl: Removing leading '/' from absolute path names
  # file: var/log/messages
  # owner: root
  # group: root
  user::rw-
  user:xyz:r--
  group::---
  mask::r--
  other::---
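The mask entry in this output sets the upper limit for named users and for group entries, while the file owner and other are not affected by it. To illustrate, here is a minimal Python 3 sketch that reads getfacl output and computes the permissions that actually apply to each entry. The function name effective_permissions is my own, and the sketch skips default ACL entries and any line that is not in tag:qualifier:permissions form.

```python
def effective_permissions(getfacl_output):
    """Map each ACL entry to the permissions that actually apply to it."""
    entries = []
    mask = None
    for raw in getfacl_output.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop "# owner: root" style comments
        parts = line.split(":")
        if len(parts) != 3:
            continue  # blank lines, messages, default ACL entries
        tag, qualifier, perms = parts
        if tag == "mask":
            mask = perms
        entries.append((tag, qualifier, perms))

    effective = {}
    for tag, qualifier, perms in entries:
        # The mask limits named users and all group entries only.
        limited = (tag == "user" and qualifier != "") or tag == "group"
        if mask is not None and limited:
            perms = "".join(p if m != "-" else "-" for p, m in zip(perms, mask))
        effective[tag + ":" + qualifier] = perms
    return effective


sample = """# file: var/log/messages
# owner: root
# group: root
user::rw-
user:xyz:r--
group::---
mask::r--
other::---
"""

for entry, perms in sorted(effective_permissions(sample).items()):
    print(entry, perms)
```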

ext4 : mount options for ext4 file system in /etc/fstab

In /etc/fstab, the mount option is usually given as defaults, as follows


UUID=fffff7aa-57b8-40aa-baa4-588c4eff7651          /              ext4    defaults        1 1
UUID=8b5a0a93-1dd3-4394-bb3e-0032a77201fa     /boot       ext4    defaults        1 2

What does the option defaults stand for in the ext4 file system?

The default options for the ext4 file system are: rw, suid, dev, exec, auto, nouser, async

The different file system mount options available are

auto       - Mount automatically at boot, or when the command mount -a is issued.
noauto   - Mount only when you tell it to.
exec      - Allow execution of binaries on the filesystem.
noexec  - Disallow execution of binaries on the filesystem.
ro         - Mount the filesystem read-only.
rw        - Mount the filesystem read-write.
user      - Allow any user to mount the filesystem. This automatically implies noexec, nosuid, nodev, unless overridden.
users    - Allow any user to mount and to unmount the filesystem, even when another ordinary user mounted it.
nouser  - Allow only root to mount the filesystem.
owner  - Allow the owner of device to mount.
sync     - I/O should be done synchronously.
async   - I/O should be done asynchronously.
dev      - Interpret character or block special devices on the filesystem.
nodev  - Don't interpret character or block special devices on the filesystem.
suid    - Allow the operation of suid, and sgid bits. They are mostly used to allow users on a computer system to execute binary executables with temporarily elevated privileges in order to perform a specific task.
nosuid - Block the operation of suid, and sgid bits.
noatime     - Don't update inode access times on the filesystem. Can help performance.
nodiratime - Do not update directory inode access times on the filesystem. Can help performance.
relatime     - Update inode access times relative to modify or change time. Access time is only updated if the previous access time was earlier than the current modify or change time. Similar to noatime. Can help performance.
flush        - The vfat option to flush data more often, so that copy dialogs or progress bars stay up until all data is written.
acl         - Enable Access Control List(acl) for filesystem
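To see what an /etc/fstab entry really asks for, the defaults keyword can be expanded into the individual options listed earlier. Below is a minimal Python 3 sketch, assuming a complete six-field entry. The names DEFAULTS and parse_fstab_line are hypothetical, and the expansion simply uses the list given in this post.

```python
# Expansion of "defaults" as listed in this post.
DEFAULTS = ["rw", "suid", "dev", "exec", "auto", "nouser", "async"]


def parse_fstab_line(line):
    """Split a six-field fstab entry and expand the 'defaults' keyword."""
    device, mount_point, fs_type, options, dump, fsck_pass = line.split()
    expanded = []
    for option in options.split(","):
        if option == "defaults":
            expanded.extend(DEFAULTS)
        else:
            expanded.append(option)
    return {
        "device": device,
        "mount_point": mount_point,
        "type": fs_type,
        "options": expanded,
        "dump": int(dump),
        "pass": int(fsck_pass),
    }


entry = parse_fstab_line(
    "UUID=fffff7aa-57b8-40aa-baa4-588c4eff7651   /  ext4    defaults,acl        1 1"
)
print(entry["mount_point"], entry["options"])
```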

Wednesday, April 3, 2013

Python : Loop through alphabets a to z

To loop through the letters a to z and print them, here is a simple script in Python


#!/usr/bin/python

import string

alphabets = string.ascii_lowercase

for i in alphabets:
    print i

Output

a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
s
t
u
v
w
x
y
z
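As an alternative that does not need the string module, the same letters can be generated from their character codes with ord() and chr(). This is a small sketch; the function name lowercase_letters is my own.

```python
def lowercase_letters():
    """Build the list of letters a to z from their character codes."""
    return [chr(code) for code in range(ord("a"), ord("z") + 1)]


for letter in lowercase_letters():
    print(letter)
```

The + 1 is needed because range() stops before its end value.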