Sunday, October 14, 2012

CPU Caches


Each CPU has its own cache, and each cache has its own associated cache controller. Data must be loaded into the CPU cache before the CPU can work on it.

Cache memory is organized into lines. Each line of cache holds a copy of one fixed-size block of main memory.

A cache may hold

  • instructions for the processor (the I-cache)
  • data (the D-cache)
  • both instructions and data (a unified cache)

Cache Hit
When the CPU makes a reference to main memory, the cache controller first checks whether the requested address is in the cache. If it is, the access is referred to as a cache hit.

Cache Miss
When the CPU makes a reference to main memory and the cache controller does not find the requested address in the cache, a cache miss occurs.

Cache Line Fill
When a cache miss occurs, the requested memory location must be read from main memory and brought into the cache. This process of moving data from RAM into the cache is known as a cache line fill.

The processor both reads from and writes to cache memory. When something is written to the cache, main memory (RAM) must eventually be updated too. The write operation can be configured in two ways
  • write-through
  • write-back
If write-through caching is enabled, then whenever a line of cache is updated, the corresponding location in memory is updated as well.
If write-back caching is enabled, a write to a cache line is not written back to main memory until the cache line is evicted (deallocated).
Write-back caching is generally more efficient than write-through caching, because repeated writes to the same line cost only one write to main memory.
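
The difference in memory traffic between the two policies can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the cache has a single line, the function name count_memory_writes is made up for this example, and each argument after the policy names the memory line being written.

```shell
#!/usr/bin/env bash
# Toy model: a one-line cache, counting the writes that reach main memory.
count_memory_writes() {
    local policy=$1; shift
    local cached="" dirty=0 writes=0 line
    for line in "$@"; do
        if [ "$policy" = "write-through" ]; then
            # Every store is propagated to main memory immediately.
            writes=$((writes + 1))
            continue
        fi
        # write-back: a store to a different line evicts the cached line,
        # and the evicted line is written to memory only if it is dirty.
        if [ "$line" != "$cached" ] && [ "$dirty" -eq 1 ]; then
            writes=$((writes + 1))
        fi
        cached=$line
        dirty=1
    done
    # Flush the last dirty line so both policies leave memory up to date.
    if [ "$policy" = "write-back" ] && [ "$dirty" -eq 1 ]; then
        writes=$((writes + 1))
    fi
    echo "$writes"
}

count_memory_writes write-through A A A B B A   # prints 6
count_memory_writes write-back    A A A B B A   # prints 3
```

Real caches have many lines and more elaborate replacement policies, so the numbers only show the trend: the more often the same line is rewritten, the more write-back saves.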

CPU Cache Types

  • Direct Mapped Cache : This is the least expensive type of cache memory. Each memory location can be cached in exactly one line, so each line of a direct mapped cache can only hold specific locations in memory.
  • Fully Associative Cache Memory : The most flexible type of cache memory and consequently the most expensive one. In a fully associative cache, a line can cache any location in main memory.
  • Set Associative Cache Memory : Set associative memory provides a good compromise between direct mapped cache and fully associative cache. Most systems use this type of cache. It is usually referred to as n-way set associative, where n is typically a power of 2. Each memory location maps to one set and can be cached in any one of the n lines of that set.

There can be multiple levels of cache, such as L1 and L2.
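
How an address maps onto the three cache types above can be worked out with integer arithmetic. The following is a minimal sketch, not how any particular CPU is wired: the function name cache_locate and the example address are made up, and the geometry (32 KB, 64-byte lines) is just an illustrative choice.

```shell
#!/usr/bin/env bash
# Split a byte address into tag, set index and line offset for a cache of
# the given total size, associativity (ways) and line size, all in bytes.
cache_locate() {
    local addr=$(( $1 )) size=$2 ways=$3 line=$4
    local sets=$(( size / (ways * line) ))   # number of sets in the cache
    local block=$(( addr / line ))           # which memory block the address is in
    echo "set=$(( block % sets )) tag=$(( block / sets )) offset=$(( addr % line ))"
}

cache_locate 0x12345 32768 8 64     # 8-way: 64 sets          -> set=13 tag=18 offset=5
cache_locate 0x12345 32768 1 64     # direct mapped: 512 sets -> set=141 tag=2 offset=5
cache_locate 0x12345 32768 512 64   # fully associative: 1 set -> set=0 tag=1165 offset=5
```

A direct mapped cache is the 1-way case and a fully associative cache is the case with a single set, so the same arithmetic covers all three types.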

To find out the cache sizes on our system

# x86info -c
Found 2 CPUs
--------------------------------------------------------------------------
CPU #1
Cache info
 L1 Instruction cache: 32KB, 8-way associative. 64 byte line size.
 L1 Data cache: 32KB, 8-way associative. 64 byte line size.
 L2 cache: 1MB, sectored, 8-way associative. 64 byte line size.
--------------------------------------------------------------------------
CPU #2
Cache info
 L1 Instruction cache: 32KB, 8-way associative. 64 byte line size.
 L1 Data cache: 32KB, 8-way associative. 64 byte line size.
 L2 cache: 1MB, sectored, 8-way associative. 64 byte line size.

# getconf -a | grep -i cache
LEVEL1_ICACHE_SIZE                 32768
LEVEL1_ICACHE_ASSOC                8
LEVEL1_ICACHE_LINESIZE             64
LEVEL1_DCACHE_SIZE                 32768
LEVEL1_DCACHE_ASSOC                8
LEVEL1_DCACHE_LINESIZE             64
LEVEL2_CACHE_SIZE                  1048576
LEVEL2_CACHE_ASSOC                 8
LEVEL2_CACHE_LINESIZE              64
LEVEL3_CACHE_SIZE                  0
LEVEL3_CACHE_ASSOC                 0
LEVEL3_CACHE_LINESIZE              0
LEVEL4_CACHE_SIZE                  0
LEVEL4_CACHE_ASSOC                 0
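
On Linux the same information is usually also exposed through sysfs, which helps when x86info is not installed. The sketch below assumes the usual layout under /sys/devices/system/cpu/cpu0/cache; the function name list_caches is made up for this example, and the directory can be overridden with an argument.

```shell
#!/usr/bin/env bash
# Print one summary line per cache found in a sysfs-style cache directory.
list_caches() {
    local base=${1:-/sys/devices/system/cpu/cpu0/cache} dir
    for dir in "$base"/index*; do
        # Skip silently when the directory (or the glob match) does not exist.
        [ -r "$dir/level" ] || continue
        echo "L$(cat "$dir/level") $(cat "$dir/type"): $(cat "$dir/size"), $(cat "$dir/ways_of_associativity")-way, $(cat "$dir/coherency_line_size") byte line"
    done
}

list_caches
```

A single value can also be read on its own, for example with getconf LEVEL1_DCACHE_LINESIZE.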

Profiling Cache Usage using valgrind

The following command runs a program under cachegrind, which simulates its cache usage

# valgrind --tool=cachegrind <programname>
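
A slightly fuller run might look like the sketch below. The wrapper name profile_cache is made up for this example. It passes --cache-sim=yes explicitly, since the simulation may not be on by default in every Valgrind release, and writes the results to a known file so that cg_annotate can summarize them.

```shell
#!/usr/bin/env bash
# Run a command under cachegrind, then print the annotated report.
profile_cache() {
    command -v valgrind >/dev/null 2>&1 || { echo "valgrind is not installed" >&2; return 127; }
    local out status=0
    out=$(mktemp) || return 1
    # --cachegrind-out-file names the results file instead of cachegrind.out.<pid>.
    valgrind --tool=cachegrind --cache-sim=yes --cachegrind-out-file="$out" "$@" || status=$?
    # Annotate only if the profiled command succeeded.
    if [ "$status" -eq 0 ]; then
        cg_annotate "$out" || status=$?
    fi
    rm -f "$out"
    return "$status"
}
# Usage: profile_cache ./programname arg1 arg2
```

The report lists instruction and data references along with the simulated misses at each cache level.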


Saturday, October 6, 2012

Apache Memory Leak and MaxRequestsPerChild


In many cases Apache will leak memory. Once enough memory has leaked, Apache is no longer able to fork new child processes to serve new requests. However, httpd can still answer requests with the memory it has already allocated.

To recover from a memory leak in Apache, we usually need to restart the Apache service to free up the memory. If memory leaks are frequent, the restart can be automated

  1. We can set a cron job to restart Apache every "n" hours, or
  2. If log rotation is in place for Apache, the reload or restart that it typically triggers when the log files are rotated will free up the memory

However, a more effective way to deal with Apache memory leaks is to set an appropriate value for the MaxRequestsPerChild directive in Apache.

MaxRequestsPerChild:


The MaxRequestsPerChild directive sets the limit on the number of requests that an individual child server process will handle. After MaxRequestsPerChild requests, the child process will die.

If it is set to 0, the child process will never expire. The built-in default depends on the Apache version, and Apache 2.4 calls the directive MaxConnectionsPerChild while still accepting the old name. It is appropriate to set this to a value of a few thousand. This limits how much memory a leak can consume, since each process dies after serving a certain number of requests. Don't set it too low, since creating new processes does have overhead.

For example, setting MaxRequestsPerChild 2000 in the Apache conf file means that each worker process is shut down after 2000 requests, which frees the memory it had leaked and held.
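
To judge whether a given value is working, it can help to watch the resident memory of the Apache child processes over time. The following is a rough sketch: the helper names are made up, the process name httpd is an assumption (Debian-based systems typically use apache2), and the -C option assumes the procps version of ps found on Linux.

```shell
#!/usr/bin/env bash
# Sum the second column (RSS in KB) of "pid rss" lines read from stdin.
sum_rss_kb() {
    awk '{ total += $2 } END { print total + 0 }'
}

# List "pid rss" for every process with the given name (default: httpd).
apache_children_rss() {
    ps -C "${1:-httpd}" -o pid=,rss=
}

# Usage: apache_children_rss httpd | sum_rss_kb
```

If the total keeps climbing between restarts, the limit is probably too high for the leak rate; if it stays flat, the value can likely be raised to reduce the forking overhead.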

Bash Socket Programming : Scan ports using Bash shell's /dev/tcp


In the bash shell, a network socket can be opened and data passed through it.
A TCP or a UDP socket can be opened by redirecting to or from one of the following special filenames

  • /dev/tcp/host/port - The host should be a valid hostname or Internet address, and port should be an integer port number or service name. bash attempts to open a TCP connection to the corresponding socket.
  • /dev/udp/host/port - The host should be a valid hostname or Internet address, and port should be an integer port number or service name. bash attempts to open a UDP connection to the corresponding socket.
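
As an illustration of passing data through such a socket, the sketch below sends a minimal HTTP request and prints the reply. The function name http_head is made up for this example, and it assumes the server speaks plain HTTP on the given port (80 unless specified).

```shell
#!/usr/bin/env bash
# Send a HEAD request over /dev/tcp and print whatever the server returns.
# The body runs in a subshell, so file descriptor 3 is closed on return.
http_head() (
    host=$1
    port=${2:-80}
    # Open a read/write TCP connection on file descriptor 3.
    exec 3<>"/dev/tcp/$host/$port" || exit 1
    # Write the request to the socket, then read the response from it.
    printf 'HEAD / HTTP/1.0\r\nHost: %s\r\n\r\n' "$host" >&3
    cat <&3
)
# Usage: http_head example.com
```

The same file descriptor is used for both directions, which is why it is opened with <> rather than with > or < alone.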

Let us see how to use it for port scanning. For example, to find out whether port 80 is open or closed on google.com, run the following command

$ echo >/dev/tcp/google.com/80 && echo "port 80 is open" || echo "port 80 is closed"

If the port is open, the output is received quickly. However, if the connection attempt gets no reply (for example, when a firewall silently drops the packets), it can take a long time to get an output.
So we can use the "timeout" utility to exit the command if no response is received within the specified time limit.

$ timeout 1 bash -c "echo >/dev/tcp/$host/$port" && echo "port $port is open" || echo "port $port is closed"

  • bash -c : If the -c option is present, then commands are read from the string.

To scan a range of ports, for example ports 1 to 100, do the following

for port in {1..100}
do
    timeout 1 bash -c "echo >/dev/tcp/google.com/$port" && echo "port $port is open" || echo "port $port is closed"
done
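
The loop above can be wrapped into reusable functions. This is one possible sketch: the names port_open and scan_ports are made up for this example. It passes the host and port to bash -c as positional arguments rather than interpolating them into the command string, and it discards the connection error messages.

```shell
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port succeeds within the time limit.
port_open() {
    local host=$1 port=$2 secs=${3:-1}
    # $1 and $2 inside the single quotes are the arguments given after "_".
    timeout "$secs" bash -c ': >"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Print only the open ports between first and last (inclusive).
scan_ports() {
    local host=$1 port=$2 last=$3
    while [ "$port" -le "$last" ]; do
        if port_open "$host" "$port"; then
            echo "$port"
        fi
        port=$((port + 1))
    done
}
# Usage: scan_ports google.com 1 100
```

A while loop is used because brace expansion such as {1..100} does not expand variables, so it cannot take the range from function arguments.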

Friday, October 5, 2012