Calculator with awk

In bash, define the following function:

calc(){ awk "BEGIN{ print $* }" ; }

Then call it like this:

calc 2*1 - 12

or, quoted so that the shell leaves * and the parentheses alone:

calc '2*(1 - 12)'

You can also put the function in your ~/.bash_profile.
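
The same trick extends to formatted output. Here is a minimal sketch, assuming you want a fixed number of decimal places; the name calcf is made up for this example and is not part of the original snippet:

```shell
# calcf is a hypothetical variant of calc: the same awk BEGIN trick,
# but the result is formatted with printf to two decimal places.
calcf() { awk "BEGIN { printf \"%.2f\n\", $* }"; }

calcf '10 / 3'       # prints 3.33
calcf '2*(1 - 12)'   # prints -22.00
```

As with calc, quoting the expression keeps the shell from expanding * or interpreting the parentheses before awk sees them.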

Command – traceroute

Traceroute is a command which can show you the path a packet of information takes from your computer to one you specify. It will list all the routers it passes through until it reaches its destination, or fails to and is discarded. In addition to this, it will tell you how long each hop from router to router takes.

[root@sjc14-esx2-vm3 ~]# traceroute iph.csi2.c3w.tv
traceroute to iph.csi2.c3w.tv (192.118.77.180), 30 hops max, 40 byte packets
 1  172.29.96.1 (172.29.96.1)  3.150 ms  3.141 ms  3.135 ms
 2  sjc14-00lab-gw1-gig1-4.cisco.com (172.24.114.181)  3.120 ms  3.105 ms  3.094 ms
 3  sjc12-lab4-gw1-ten6-7.cisco.com (172.24.95.29)  3.086 ms  3.078 ms  3.067 ms
 4  sjc5-sbb4-gw1-ten8-6.cisco.com (171.71.241.174)  3.049 ms  3.045 ms  3.037 ms
 5  sjc12-rbb-gw4-ten7-4.cisco.com (171.71.241.254)  3.003 ms  2.997 ms  2.988 ms
 6  sjc12-gb1-ten2-2.cisco.com (10.112.4.157)  2.974 ms  3.924 ms  3.914 ms
 7  capnet-rtp10-sjc12-10ge.cisco.com (10.112.4.162)  78.649 ms  78.646 ms  79.351 ms
 8  rtp5-rbb-gw1-ten4-6.cisco.com (10.112.4.106)  81.145 ms  81.143 ms  81.863 ms
 9  rtp5-gb2-ten2-1.cisco.com (10.112.3.77)  79.921 ms  79.917 ms  79.911 ms
10  capnet-amsidc-rtp5-oc48.cisco.com (10.112.4.114)  167.285 ms  167.277 ms  167.263 ms
11  amsidc-rbb-gw2-ten2-1.cisco.com (10.112.4.202)  167.197 ms  167.194 ms  166.839 ms
12  amsidc-wan-gw1-ten6-2.cisco.com (144.254.78.14)  168.174 ms  167.671 ms  167.639 ms
13  amsidc-cw-pe1-oc48.cisco.com (10.61.40.18)  167.006 ms  167.087 ms  167.075 ms
14  ntn01-wan-gw1-ser1-0.cisco.com (144.254.136.193)  256.178 ms  256.308 ms  256.298 ms
15  ntn01-bb-gw2-gig2-7.cisco.com (64.103.115.205)  255.074 ms  254.828 ms  255.553 ms
16  ntn01-corp-gw1-gig0-2.cisco.com (64.103.116.14)  254.310 ms  254.302 ms  254.297 ms
17  ntn01-dmzbb-gw1-gig2-43.cisco.com (192.118.78.166)  260.572 ms  260.568 ms  257.431 ms
18  ntn01-dmznet-gw1-gig1-1.cisco.com (192.118.78.86)  254.664 ms  254.664 ms  254.655 ms
19  ntn01-dmzlab-gw1-gig1-1.cisco.com (192.118.76.26)  256.993 ms  257.295 ms  256.258 ms
20  csi-scp-dmz-gw.cisco.com (192.118.76.106)  257.160 ms  257.138 ms  256.874 ms
21  csi-scp-dmz-gw.cisco.com (192.118.76.106)  257.300 ms !X * *
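
Since each line carries the round-trip times for one hop, a little awk can point out where the latency jumps. This is only a sketch: the helper names trace_sample and rtt_jumps are made up here, the sample lines are copied from the trace above, and it assumes the usual "hop host (ip) time ms ..." layout.

```shell
# A few lines copied from the trace above (hops 6, 7, 9 and 10).
trace_sample() {
cat <<'EOF'
 6  sjc12-gb1-ten2-2.cisco.com (10.112.4.157)  2.974 ms  3.924 ms  3.914 ms
 7  capnet-rtp10-sjc12-10ge.cisco.com (10.112.4.162)  78.649 ms  78.646 ms  79.351 ms
 9  rtp5-gb2-ten2-1.cisco.com (10.112.3.77)  79.921 ms  79.917 ms  79.911 ms
10  capnet-amsidc-rtp5-oc48.cisco.com (10.112.4.114)  167.285 ms  167.277 ms  167.263 ms
EOF
}

# For every hop line after the first, print the hop number, the host name,
# and how much the first round-trip time changed since the previous line.
# Lines whose 4th field is not a number (for example "*") are skipped.
rtt_jumps() {
  awk '$1 ~ /^[0-9]+$/ && $4 ~ /^[0-9.]+$/ {
         if (seen) printf "%s %s %+.3f ms\n", $1, $2, $4 - prev
         prev = $4; seen = 1
       }'
}

trace_sample | rtt_jumps
```

On the sample, the two large increases line up with the capnet-rtp10 and capnet-amsidc hops.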

Wireshark Filter Example

ip.addr == 172.29.96.30 and http and http.request.method == "GET"

Clear DNS Cache on Linux

To invalidate the hosts cache kept by nscd (the name service cache daemon), i.e. to clear the DNS cache:

nscd -i hosts

Parsing (X)HTML in C – A libxml2 tutorial

Parsing (X)HTML in C is often seen as a difficult task.  It’s true that C isn’t the easiest language in which to develop a parser.  Fortunately, libxml2’s HTMLParser module comes to the rescue.  So, as promised, here’s a small tutorial explaining how to use libxml2’s HTMLParser to parse (X)HTML.
First, you need to create a parser context.  There are several functions for doing that, depending on how you want to feed data to the parser.  I’ll use htmlCreatePushParserCtxt(), since it works with memory buffers.

htmlParserCtxtPtr parser = htmlCreatePushParserCtxt(NULL, NULL, NULL, 0, NULL, 0);

Then, you can set many options on that parser context.

htmlCtxtUseOptions(parser, HTML_PARSE_NOBLANKS | HTML_PARSE_NOERROR | HTML_PARSE_NOWARNING | HTML_PARSE_NONET);

We are now ready to parse an (X)HTML document.

// char * data : buffer containing part of the web page
// int len : number of bytes in data
// Last argument is 0 if the web page isn’t complete, and 1 for the final call.
htmlParseChunk(parser, data, len, 0);

Once you’ve pushed all your data, call that function again with a NULL buffer, a length of 0, and 1 as the last argument.  This ensures that the parser has processed everything.

Finally, how do you get the data you parsed?  That’s easier than it seems.  You simply walk the tree the parser created.

#include <stdio.h>
#include <libxml/HTMLparser.h>

void walkTree(xmlNode * a_node)
{
  xmlNode *cur_node = NULL;
  xmlAttr *cur_attr = NULL;
  for (cur_node = a_node; cur_node; cur_node = cur_node->next) {
    // do something with that node's information, like printing the tag's name and attributes
    printf("Got tag : %s\n", cur_node->name);
    for (cur_attr = cur_node->properties; cur_attr; cur_attr = cur_attr->next) {
      printf("  -> with attribute : %s\n", cur_attr->name);
    }
    // recurse into the children of the current node
    walkTree(cur_node->children);
  }
}
 
walkTree(xmlDocGetRootElement(parser->myDoc));

And that’s it!  Isn’t that simple enough?  From there, you can do all kinds of things, such as finding all referenced images (by looking for “img” tags) and fetching them, or anything else you can think of.

Also, you should know that you can walk the XML tree anytime, even if you haven’t parsed the whole (X)HTML document yet.

If you have to parse (X)HTML in C, you should use libxml2’s HTMLParser.  It will save you a lot of time.

Posted via email from feinan’s posterous

Print All Emails in Web Pages

#!/bin/bash
for ((i=1;i<=20;i++))
do
  wget -q -O - http://www.mitbbs.com/article_t1/Immigration/31933935_0_$i.html | grep -o '[[:alnum:]+\.\_\-][[:alnum:]+\.\_\-]*@[[:alnum:]+\.\_\-]*[[:alnum:]+]' | sort | uniq
done
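
To try the pattern without fetching anything, the grep stage can be run on a local string. This is a sketch under a few assumptions: extract_emails is a made-up name, the addresses are placeholders, and the backslashes inside the bracket expressions are dropped because POSIX bracket expressions treat them as literal characters.

```shell
# Hypothetical helper: read text on stdin, print each distinct address once.
# Same character classes as the script above, minus the literal backslashes.
extract_emails() {
  grep -o '[[:alnum:]+._-][[:alnum:]+._-]*@[[:alnum:]+._-]*[[:alnum:]+]' | sort -u
}

printf '%s\n' \
  'Contact alice@example.com or bob.smith@example.org.' \
  'Again: alice@example.com' | extract_emails
# prints:
# alice@example.com
# bob.smith@example.org
```

Piping the whole loop into one sort -u, rather than sorting inside the loop, would also remove duplicates that appear on different pages.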

Sort File by Given Column

sort -r +2 -3 infile
+M Start at the first character of the (M+1)th field.
-N End at the last character of the Nth field (if -N is omitted, assume the end of the line).
-f Ignore case when comparing (so "Bill" and "bill" are treated the same).
-r Sort in reverse order (so "Z" starts the list instead of "A").
-n Sort a column in numerical order.
-tx Use x as the field delimiter (replace x with a comma or other character).
-u Suppress all but one line in each set of lines with equal sort fields (so if you sort on a field containing last names, only one "Smith" will appear even if there are several).
The +M -N form is obsolete and may be rejected by current versions of sort; the POSIX equivalent of the command above is sort -r -k 3,3 infile.
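
For a quick check of how field selection behaves, here is a small sketch using the POSIX -k option on made-up data (the function name people and its three lines are illustrative only):

```shell
# Hypothetical sample data: name, age, city, separated by single spaces.
people() {
  printf '%s\n' 'bob 25 berlin' 'carol 35 amsterdam' 'alice 30 paris'
}

# Reverse alphabetical order on the third field only.
people | sort -r -k 3,3
# alice 30 paris
# bob 25 berlin
# carol 35 amsterdam

# Numerical order on the second field only.
people | sort -n -k 2,2
# bob 25 berlin
# alice 30 paris
# carol 35 amsterdam
```

Writing the key as 3,3 rather than just 3 stops the comparison at the end of the third field instead of running to the end of the line.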

Print Even Lines in a File

awk 'NR%2==0' infile
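
The same idea generalizes to every Nth line. A minimal sketch, where every_nth is a made-up helper name and the step is passed to awk as a variable:

```shell
# Hypothetical helper: print every Nth line of stdin (N is the first argument).
every_nth() { awk -v n="$1" 'NR % n == 0'; }

printf '%s\n' a b c d e f | every_nth 2   # prints b, d, f (the even lines)
printf '%s\n' a b c d e f | every_nth 3   # prints c, f
```

For the odd lines, the condition becomes NR % 2 == 1.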

Untar All .tgz Files in a Directory

for i in *.tgz; do tar zxvf "$i"; done
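
The loop above only looks at the current directory. To descend into subdirectories as well, one option is to let find collect the archives. This is a sketch, not the original author's method; untar_all is a made-up name, and it assumes each archive should be unpacked next to where it was found.

```shell
# Hypothetical helper: extract every .tgz file under the given directory
# (default: the current one), each into the directory that contains it.
untar_all() {
  find "${1:-.}" -type f -name '*.tgz' -exec sh -c '
    for archive in "$@"; do
      tar -xzf "$archive" -C "$(dirname "$archive")"
    done' sh {} +
}

# Usage: untar_all /path/to/downloads
```

Passing the file names to sh as arguments, instead of parsing the output of ls, keeps names containing spaces intact.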

List Word Occurrence for A Given Document

cat nautilus-debug-log.txt | tr -cs A-Za-z '\012' | tr A-Z a-z | sort | uniq -c | sort -r -n | sed 25q
  2660 x
  2659 user
  2659 to
  2659 signal
  2659 log
  2659 dumped
  2659 due
  2659 debug
     1 milestones
     1 begin
     1
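
The pipeline can be wrapped in a function and tried on a short string to see what each stage contributes. A minimal sketch; word_freq is a made-up name, and the sample sentence is invented for the example:

```shell
# Hypothetical helper: read text on stdin and print the N most frequent
# words (default 25), most frequent first, in lower case.
word_freq() {
  tr -cs 'A-Za-z' '\n' |   # turn every run of non-letters into one newline
    tr 'A-Z' 'a-z' |       # fold to lower case
    sort | uniq -c |       # count identical words
    sort -rn |             # highest count first
    head -n "${1:-25}"     # keep the top N lines
}

printf 'The log: the user dumped the log\n' | word_freq 2
# prints "3 the" then "2 log" (counts are right-aligned by uniq -c)
```

If the input starts with a non-letter, the first tr emits an empty line, which may explain the count with no word at the end of the output above.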
