<?xml version="1.0" encoding="utf-8"?>
<!-- generator="Kukkaisvoima version 9" -->
<rss version="2.0"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
>
<channel>
<atom:link href="http://www.nickcoleman.org/blog/index.cgi/feed" rel="self" />
<title>Nick Coleman: internet</title>
<link>http://www.nickcoleman.org/blog/index.cgi</link>
<description>Nick Coleman blog</description>
<pubDate>Thu, 10 May 2012 11:49:00 -0700</pubDate>
<lastBuildDate>Thu, 10 May 2012 11:49:00 -0700</lastBuildDate>
<generator>http://23.fi/kukkaisvoima/</generator>
<language>en</language>
<item>
<title>I Moved From Google-chrome To Firefox
</title>
<link>http://www.nickcoleman.org/blog/index.cgi?post=googlechrome-to-firefox%21201205101149%21general%2Cinternet</link>
<comments>http://www.nickcoleman.org/blog/index.cgi?post=googlechrome-to-firefox%21201205101149%21general%2Cinternet#comments</comments>
<pubDate>Thu, 10 May 2012 11:49:00 -0700</pubDate>
<dc:creator>Nick</dc:creator>
<category>general</category>
<category>internet</category>
<guid isPermaLink="true">http://www.nickcoleman.org/blog/index.cgi?post=googlechrome-to-firefox%21201205101149%21general%2Cinternet/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[
<p>
Several weeks ago, I changed my default browser from Google-chrome to Firefox.  Here's
why.
<p>
I had been using Google-chrome for over a year and had been a pretty happy user.  I
liked the Google-chrome user interface and Google had clearly put a lot of thought into how a
user interacts with their browser.  Neat touches were everywhere.
<p>
I did have some niggling privacy concerns.  After all, Google-chrome is created by the biggest
data mining company in the world.  But I figured Google had too much to lose by ignoring
privacy in a browser, especially one whose source code is
available to all.
<p>
Then I discovered that Google-chrome forbids the Ghostery add-on from blocking the
Doubleclick network. Ghostery is a popular add-on that prevents third-party sites from
monitoring your browsing, and Doubleclick is a monitoring and advertising network owned
by Google.
<p>
Apparently, Google has no problem with bending the privacy rules to advantage itself.<a
href="#fn1" id="back1"><sup>1</sup></a>  Time to give Firefox a go.
<p>
I hadn't used Firefox seriously for years and was pleasantly surprised.  It starts
fairly quickly and is not a huge memory pig like Google-chrome.  It is a little plain out of
the box, but some add-ons soon fixed that.
<p>
A few months later, this is what I am running.  For privacy:
<ul>
<li><strong>adblock-plus</strong>, to block ads</li>
<li><strong>ghostery</strong>, to block third-party cookies and scripts</li>
<li><strong>noscript</strong>, to block any javascript</li>
<li><strong>requestpolicy</strong>, to block any third-party reference</li>
<li><strong>refcontrol</strong>, to control the <a href="http://www.nickcoleman.org/axs/ax.pl?http://en.wikipedia.org/wiki/HTTP_referer" title="Wikipedia: Referer header"><em>referer header</em></a> so that my browsing history doesn't leak to third parties, and </li>
<li><strong>cookiemonster</strong>, to control cookies.  </li>
</ul>
<p>
Then I wanted to enhance Firefox to give a good user experience like Google-chrome.  These
add-ons did the trick:
<ul>
<li><strong>add to searchbar</strong>, add new search engines to the search toolbar</li>
<li><strong>customizable shortcuts</strong>, create your own keystroke shortcuts</li>
<li><strong>duplicate in tab context menu</strong>, duplicate a tab</li>
<li><strong>firegestures</strong>, similar to vimium but light-weight, uses Vim keystrokes to control the browser, </li>
<li><strong>open link in ...</strong>, adds extra options to the right-click context menu, and</li>
<li><strong>sessionmanager</strong>, save and load sessions by name, and keeps a history.</li>
</ul>
<p>
I now have the same facilities in Firefox that I had in Google-chrome, plus it uses about a
third the memory and addresses all my privacy niggles.
<p>
(I've used the exact name of the add-ons above so you can search for them easily.)
<p>
<div class="footnote">
<a href="#back1" id="fn1">[1] </a>I may be attributing malice to something that is
merely a by-product of how Google-chrome handles plug-ins. Google-chrome apparently does not guarantee
the order in which plug-ins see the DOM, which means that other plug-ins may have
applied their magic before Ghostery, thereby removing any Doubleclick references before Ghostery sees
them. Occam's razor probably applies.   <a href="#back1">&#8593;</a>
</div>
]]></content:encoded>
<wfw:commentRss>http://www.nickcoleman.org/blog/index.cgi?post=googlechrome-to-firefox%21201205101149%21general%2Cinternet/feed/</wfw:commentRss>
</item>
<item>
<title>iiNet Wins High Court Appeal by Movie Studios
</title>
<link>http://www.nickcoleman.org/blog/index.cgi?post=iinet-piracy-high-court%21201204200844%21internet%2Claw</link>
<comments>http://www.nickcoleman.org/blog/index.cgi?post=iinet-piracy-high-court%21201204200844%21internet%2Claw#comments</comments>
<pubDate>Fri, 20 Apr 2012 08:44:00 -0700</pubDate>
<dc:creator>Nick</dc:creator>
<category>internet</category>
<category>law</category>
<guid isPermaLink="true">http://www.nickcoleman.org/blog/index.cgi?post=iinet-piracy-high-court%21201204200844%21internet%2Claw/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[
<p>
In great news for anyone opposed to the heavy handed actions of the movie industry in
attempting to prevent piracy, the High Court has ruled that iiNet has no direct power to
prevent its users from downloading pirated content.
<p>
The music and movie industry had previously sued iiNet for, in effect, "authorising"
illegal downloads by not preventing its users from doing so.  They lost that case,
appealed, and lost the appeal.  They then appealed to the High Court.  They have lost
that and the matter is closed.
<p>
Until the government changes the law.  Which they will undoubtedly do, since they are
about to sign the <a href="http://www.nickcoleman.org/axs/ax.pl?http://en.wikipedia.org/wiki/Trans-Pacific_Strategic_Economic_Partnership" title="Wikipedia TPP">TPP </a> without any public discussion.  The U.S. has previously
bullied trade partners into changing their copyright enforcement laws to favour the U.S.
movie industry.  There is no doubt in my mind that they will do this again in the latest
TPP round.  The fact that the government is super quiet on the TPP makes me think they
already know that they are going to have to sell out Australia's citizens.
<p>
The High Court decision does not mean that an ISP is immune, however.  One part of the Federal Court
appeal's result was that it provided a set process whereby an ISP could be held liable.  The movie industry
lost the appeal because the process they followed did not lead to iiNet's liability.
<p>
Nevertheless, for the moment, we are protected from such silliness as losing our
internet connection or being sued for thousands of dollars because our teenager
downloaded one stupid song (that we hate because everyone knows no-one has
written decent music since the '80s, right?).
<p>
<a
href="http://www.nickcoleman.org/axs/ax.pl?http://www.theaustralian.com.au/australian-it/iinet-wins-landmark-copyright-case/story-e6frgakx-1226334090530"
title="The Australian"><i>The Australian</i></a> has a good summary of the decision and
its history.  My previous posts on this topic are <a
href="http://www.nickcoleman.org/axs/ax.pl?http://www.nickcoleman.org/blog/index.cgi?post=iinetandpiracy%21201002050932%21internet%2Claw"
title="iiNet &amp; AFACT Piracy Case">here</a> and especially <a href="http://www.nickcoleman.org/axs/ax.pl?http://www.nickcoleman.org/blog/index.cgi?post=iinetandpiracypt2%21201102250916%21internet%2Claw" title="iiNet &amp; Piracy Pt. 2">here</a>.
<p>
<hr>
<p>
For the record again, my position is the common one: <div class="image"><img
src="http://www.nickcoleman.org/blog/images/failedbusinesscleft.jpg" alt="your failed business model is not my
problem" ></div>I don't condone piracy, but the studios have no right to force a third
party to give them my personal details.  Let's not forget the studios are private
corporations.  They are not the police.  If they want my details, a court order will
give them those details.  Of course, the burden of proof then falls on them, a complication they
do not want to deal with, regardless of fairness or due process.  
<p>
]]></content:encoded>
<wfw:commentRss>http://www.nickcoleman.org/blog/index.cgi?post=iinet-piracy-high-court%21201204200844%21internet%2Claw/feed/</wfw:commentRss>
</item>
<item>
<title>Google Search Verbatim Instead Of + (plus)
</title>
<link>http://www.nickcoleman.org/blog/index.cgi?post=google-verbatim%21201203281201%21internet</link>
<comments>http://www.nickcoleman.org/blog/index.cgi?post=google-verbatim%21201203281201%21internet#comments</comments>
<pubDate>Wed, 28 Mar 2012 12:01:00 -0700</pubDate>
<dc:creator>Nick</dc:creator>
<category>internet</category>
<guid isPermaLink="true">http://www.nickcoleman.org/blog/index.cgi?post=google-verbatim%21201203281201%21internet/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[
<p>
You probably know that Google turned off the <strong>+</strong> modifier for searches.  A right
royal pain in the backside for those of us who know exactly how to search and don't want
the search engine to try and second guess us &mdash; it inevitably gets it wrong and
returns a bunch of results we are not interested in.
<p>
The word from Google at the time was that we could use <strong>"..."</strong> instead and it
would work.  Well, it didn't and doesn't: Google still corrects words and makes
substitutions within the quotes and defeats our attempt to narrow down the search.
<p>
It turns out that Google has added a new search parameter: <strong>verbatim</strong>.
Select <em>More search tools</em> &rarr; <em>Verbatim</em> from the menu at the left.  It is a
two-step process instead of just being able to use <strong>+</strong> in front of a keyword,
but it is better than before.
<p>
]]></content:encoded>
<wfw:commentRss>http://www.nickcoleman.org/blog/index.cgi?post=google-verbatim%21201203281201%21internet/feed/</wfw:commentRss>
</item>
<item>
<title>How To Set Up A Light-Weight On-line Thesaurus For Vim Pt.II
</title>
<link>http://www.nickcoleman.org/blog/index.cgi?post=vim-thesaurus-syntax%21201202250900%21blogging%2Cinternet%2Cprogramming%2Csoftware%2Cunix</link>
<comments>http://www.nickcoleman.org/blog/index.cgi?post=vim-thesaurus-syntax%21201202250900%21blogging%2Cinternet%2Cprogramming%2Csoftware%2Cunix#comments</comments>
<pubDate>Sat, 25 Feb 2012 09:00:00 -0700</pubDate>
<dc:creator>Nick</dc:creator>
<category>blogging</category>
<category>internet</category>
<category>programming</category>
<category>software</category>
<category>unix</category>
<guid isPermaLink="true">http://www.nickcoleman.org/blog/index.cgi?post=vim-thesaurus-syntax%21201202250900%21blogging%2Cinternet%2Cprogramming%2Csoftware%2Cunix/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[
<p>
<div class="summary">Vim has support for a built-in thesaurus. However, it consumes
memory and its auto-complete selection has issues. In Part I, I showed how to set up an
on-line thesaurus.   Here is how to build syntax rules that will colour the
output.</div>
<p>
<h2>Summary</h2>
<p>
This is the second post of two about a light weight way to implement a thesaurus.  In
<a
href="http://www.nickcoleman.org/axs/ax.pl?http://www.nickcoleman.org/blog/index.cgi?post=vim-thesaurus%21201202170802%21general%2Cblogging%2Cinternet%2Cprogramming%2Csoftware%2Cunix">
Part I</a>, I described how to set up a script that provides access to an on-line thesaurus.
In this Part, I describe how to write a set of simple syntax rules to provide colour and
highlighting for the output.
<p>
<a name="screenshot"></a>
Here is a screenshot of the finished syntax rules (using dummy data):
<div class="image">
<img src="http://www.nickcoleman.org/blog/images/thes-syntax.jpg" title="Syntax highlighting example" alt="syntax
highlighting screenshot" >
</div>
<p>
If you don't like these colours, you can change the rules to use whatever colours you prefer.
<p>
<hr>
<h2>How Vim does it</h2>
<p>
Vim does syntax highlighting in two parts:
<ul><li>A set of rules that define a (group of) word, and</li>
<li>A colour description for each rule.</li></ul>
<p>
<h4>Set of rules</h4>
The set of rules is in a file named for the filetype in either
<code>$VIMRUNTIME/syntax</code> or your <code>~/.vim/syntax</code>.  Each rule defines a
name for that rule and a list or a regular expression to match words that should be
captured by that rule.
<p>
An example is 
<div class="code">
syntax keyword     cStatement	goto break return continue asm
syntax keyword     cLabel	case default
syntax match	   cTodo        /\/\* *TODO/
syntax region	   cBlock       start="{" end="}" 
</div>
The first and second lines are the simplest case of <strong>keyword</strong>, which is a
list of words.  The third line is the next simplest case of a regex <strong>pattern
match</strong>, in this case matching "TODO" within a C-style comment leading with "/*".
The fourth line is like a broader <strong>match</strong> in that it specifies a
<strong>region</strong> defined by a start pattern and an end pattern.
<p>
Notice each rule has a name that is a unique identifier: <strong>cStatement</strong>,
<strong>cLabel</strong>, <strong>cTodo</strong> and <strong>cBlock</strong>.
<p>
In fact, it can be much more complicated than that.  Rules can be contained within rules
(for example, nested <code>if-then</code>s), rules can apply only if another rule has
triggered (a '(' as part of a condition, but not as part of a comment), rules can apply
to entire regions, and so on.
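<p>
As a rough sketch of one rule contained within another, the following marks TODO only when it
appears inside a C-style comment.  The names <code>demoComment</code> and
<code>demoTodo</code> are made up for this illustration and are not taken from any shipped syntax file.
<pre><code>
" demoTodo is 'contained': it only matches inside a rule that lists it
syntax keyword demoTodo    TODO contained
" a comment region from /* to */ that is allowed to contain demoTodo
syntax region  demoComment start="/\*" end="\*/" contains=demoTodo
" borrow colours from the standard names
hi link demoComment Comment
hi link demoTodo    Todo
</code></pre>
Outside a comment, TODO matches neither rule and keeps the normal text colour.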
<p>
However, for our simple purpose which is to provide slight colour hints to a plain text
list, we can ignore all that, to which we say, "Thank goodness."  Check out sh.vim or
even simple old c.vim  to see why.
<p>
<h4>Colour description</h4>
The colour description is usually in the <code>colorscheme</code> you are using.  For
each rule name, it provides a colour scheme to use, along the lines of 
<p>
<div class="code">
hi Comment term=bold ctermfg=8 guifg=#7C7C7C
</div>
<p>
which translates as: for the rule name "Comment", use bold on a simple terminal, set the
foreground to colour number 8 on a colour terminal, and set the
foreground to the colour #7C7C7C in the GUI version of Vim.
<p>
You can think of the process as going like this: when Vim displays a file, each
token or word is checked against the set of rules and assigned a rule name, and is then
colourised according to the highlighting for that rule name.
<p>
You can see your highlighting description with <code>:hi Comment</code>, which shows the
current highlighting for the rule "Comment".  To see all
highlighting, use <code>:hi</code> by itself.
<p>
From this you can see that the rules and the highlighting scheme are tightly coupled.
Each rule has a name, and that name should be in the colorscheme<sup><small><a href="#fn1" id="back1">1</a></small></sup> .  Vim controls the
tight coupling by providing a set of standard names which, if we were writing rules for
a new programming language, we should use.  Those names are things like "Comment",
"Keyword", "Statement", and so on.
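<p>
A minimal sketch of hooking your own rule names onto those standard names, assuming two
made-up rules called <code>demoEntryName</code> and <code>demoEntryText</code>:
<pre><code>
" 'def' (default) only applies the link if nothing has been set for
" the name already, so a colorscheme or .vimrc can still override it
hi def link demoEntryName Keyword
hi def link demoEntryText String
" show what the standard name currently looks like
hi Keyword
</code></pre>
Linking this way means the rule follows whatever colours the current colorscheme gives "Keyword" and "String".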
<p>
However, we are creating our own set of rules for this specific instance of text, so we
can do what we like.  Which we shall.
<p>
<h2>Syntax rules</h2>
<p>
<h4>The data</h4>
Here is the (made up) sample text we will be working with:
<p>
<div class="code">
   Main Entry:     which pronunciation [hwich, wich] [IMG] Show IPA/<h>wItS,
   Definition:     what
   Synonyms:       and that, that, whatever, whichever
		   in order that, in that, so, so that
   Notes:          in current usage, that refers to persons or things and
                   which is used chiefly for things. The standard rule says
                   that one uses that only to introduce a restrictive or
   Antonyms:       none
<p>
   Main Entry:     this
   Definition:     the one
   Synonyms:       that, the aforementioned one, the one in question, the
                   thing indicated, this one, this person
<p>
</div>
<p>
Typically, you get a set of entries comprising Main Entry, Definition, Synonyms, Notes
and Antonyms.  Not every entry is present; Antonyms and Notes are often absent.
<p>
As well, if the word is common, you often get repeating sets of entries for very similar
words, as in the example where the entry for <b>this</b> is also shown.  (This isn't
real data, by the way; <b>this</b> doesn't appear with <b>which</b> normally.)
<p>
<h4>Set up some rules</h4>
How to define some rules?  Some things come to mind immediately.
<ul><li>Entry names distinguished (highlighted differently) from entry contents, i.e.
"Main Entry:" should be different from "which".</li>
<li>The main entry word should be bolded or similar to make it stand out.</li>
<li>The main word melds too easily into "pronunciation..", which makes it hard to pick out.
Diminish everything to the right of the main word.</li>
<li>Each entry should be distinguished so you can scan them easily.  For example, "Main
Entry" should stand out, "Definition" less so, "Synonyms" more so.</li></ul>
<p>
Here is an example rule set that roughly implements the first, second and fourth items.
<p>
<div class="pygmentize"><pre><span class="c">&quot; Include the colon as part of the word</span><br><span class="k">setlocal</span> <span class="nb">iskeyword</span><span class="p">+=</span>:<br><br><span class="c">&quot; Rules</span><br><span class="c">&quot; a keyword</span><br><span class="nb">syntax</span> keyword  thesSynonyms Synonyms:<br><span class="c">&quot; this entry name has a space, so needs a regex </span><br><span class="nb">syntax</span> <span class="k">match</span>  thesMainEntry <span class="sr">/Main Entry: */</span> <br><span class="c">&quot; this entry should include the line to the end</span><br><span class="nb">syntax</span> region thesDefinition <span class="k">start</span><span class="p">=</span><span class="sr">/Definition: /</span>  <span class="k">end</span><span class="p">=</span><span class="sr">/$/</span><br><br><span class="c">&quot; Highlighting</span><br><span class="c">&quot; link the highlighting to the defined </span><br><span class="c">&quot; name &quot;Keyword&quot; in the colorscheme</span><br><span class="nb">hi</span> link thesMainEntry Keyword<br><span class="nb">hi</span> link thesSynonyms  Statement<br><span class="nb">hi</span> link thesDefinition  Todo<br><span class="c">&quot; specify the colours directly</span><br><span class="nb">hi</span>  thesMainEntry <span class="nb">term</span><span class="p">=</span><span class="nb">bold</span> <br> \ctermfg<span class="p">=</span>White cterm<span class="p">=</span><span class="nb">bold</span> guifg<span class="p">=</span><span class="m">6</span> gui<span class="p">=</span><span class="nb">bold</span><br><span class="c">&quot; or, keep the current colour, but bold it</span><br><span class="nb">hi</span>  thesMainEntry <span class="nb">term</span><span class="p">=</span><span class="nb">bold</span> cterm<span class="p">=</span><span class="nb">bold</span> gui<span class="p">=</span><span class="nb">bold</span><br></pre></div>
<p>
which gives this:
<div class="image">
<img src="http://www.nickcoleman.org/blog/images/thes-syntax-example.jpg" title="Syntax example rule set"
alt="syntax example" >
</div>
<p>
You can see that the keyword "Synonyms:" has matched a rule set, and the highlighting
for the rule "Statement" in my colorscheme (likely different to your colorscheme) has
been applied.
<p>
The regex pattern for "Main Entry: " has matched a rule set and the highlighting for the
rule set has been applied.
<p>
Similarly for "Definition", the rule set has matched to the end of the line, and the
highlighting for the rule "Todo" in my colorscheme  has been applied.
<p>
<h2>The real deal</h2>
<p>
<h4>Rules</h4>
Enough examples. Here is the actual syntax file to produce the highlighting in the <a
href="#screenshot">screenshot at the top of the page.</a>
<p>
<div class="pygmentize"><pre><span class="c">&quot; Vi syntax file</span><br><span class="c">&quot; Language: text dump from online thesaurus</span><br><span class="c">&quot; Maintainer: Nick Coleman</span><br><span class="c">&quot; Last Change:  2012 Feb 18</span><br><span class="c">&quot; Remark: for the online thesaurus script by Nick Coleman</span><br>  <br><span class="k">if</span> exists<span class="p">(</span><span class="s2">&quot;b:current_syntax&quot;</span><span class="p">)</span><br>    <span class="k">finish</span><br><span class="k">endif</span><br><br><span class="c">&quot; Setup</span><br><span class="c">&quot; syntax clear    &quot; only useful for testing</span><br><span class="nb">syntax</span> case <span class="k">match</span><br><span class="k">setlocal</span> <span class="nb">iskeyword</span><span class="p">+=</span>:<br><br><span class="c">&quot; Entry name rules</span><br><span class="nb">syntax</span> <span class="k">match</span> thesMainEntry <span class="sr">/Main Entry: */</span> contained<br><span class="nb">syntax</span> region  thesDefinition <span class="k">start</span><span class="p">=</span><span class="sr">/Definition: /</span>  <span class="k">end</span><span class="p">=</span><span class="sr">/$/</span><br><span class="nb">syntax</span> keyword thesNotes Notes: contained<br><span class="nb">syntax</span> keyword thesSynonyms Synonyms:<br><span class="nb">syntax</span> region  thesAntonyms <span class="k">start</span><span class="p">=</span><span class="sr">/Antonyms:/</span>  <span class="k">end</span><span class="p">=</span><span class="sr">/$/</span> <br><br><span class="c">&quot; give the pronunciation region a special name</span><br><span class="nb">syntax</span> region thesPronunciation <span class="k">start</span><span class="p">=</span><span class="sr">/pronunciation \[/</span> <span class="k">end</span><span class="p">=</span><span class="sr">/$/</span> contained<br><br><span class="c">&quot; Entry contents rules</span><br><span class="nb">syntax</span> region thesMainWord <span class="k">start</span><span class="p">=</span><span class="sr">/Main Entry:/</span>  <span class="k">end</span><span class="p">=</span><span class="sr">/$/</span> contains<span class="p">=</span>CONTAINED keepend<br><span class="nb">syntax</span> region thesNotesEntry <span class="k">start</span><span class="p">=</span><span class="sr">/Notes:/</span>  <span class="k">end</span><span class="p">=</span><span class="sr">/^ *$/</span> contains<span class="p">=</span>thesNotes<span class="p">,</span>thesAntonyms keepend <br><br><span class="c">&quot; Highlighting</span><br><br><span class="nb">hi</span> link thesMainEntry     Keyword<br><span class="nb">hi</span>  thesMainWord      <span class="nb">term</span><span class="p">=</span><span class="nb">bold</span> cterm<span class="p">=</span><span class="nb">bold</span> gui<span class="p">=</span><span class="nb">bold</span><br><span class="nb">hi</span> link thesDefinition      String<br><span class="nb">hi</span> link thesNotes     Number<br><span class="nb">hi</span> link thesNotesEntry      Number<br><span class="nb">hi</span> link thesSynonyms      Statement<br><span class="nb">hi</span> link thesAntonyms      Todo<br><span class="nb">hi</span> link thesPronunciation   Comment<br><br><span class="k">let</span> <span class="k">b</span>:current_syntax <span class="p">=</span> <span class="c">&quot;thesaurus&quot;</span><br></pre></div>
<p>
Some rules have "contains".  This allows a rule within a rule, the classic example being
Todo appearing within a comment where you want a different colour to make Todo stand
out.  "keepend" is part of that: it ends the outer rule at the first match of its own end pattern, even when
a contained rule has not finished yet.  See <code>:h usr_44</code> section 44.5 for more.
<p>
<h4>Setup</h4>
<p>
Recall from Part I that the script sets the filetype of the thesaurus buffer to thesaurus.
Put the above syntax file in <code>$HOME/.vim/syntax/thesaurus.vim</code> and the buffer will pick up the
syntax rules automatically. 
<p>
Windows users can put it in <code>C:\Program Files\Vim\vimfiles\syntax\thesaurus.vim</code> or the
equivalent if using Vista or Windows 7.  If you don't have administrator privileges,
find where Vim thinks $HOME is by (within Vim) trying <code>:echo $HOME</code> or <code>:version</code>, and
put the file in <code>$HOME\vimfiles\syntax\thesaurus.vim</code>.
<p>
<h2>Trying it out</h2>
<p>
You probably want to use your own highlighting.  A tip: to easily see the
effect of your changes, reload the highlighting for the data buffer with <code>:setlocal filetype=thesaurus</code>.
<p>
To see the colours that your colorscheme uses for a particular rule, use <code>:hi
&lt;rule&gt;</code> as in <code>:hi Statement</code>.  To see all colours, use <code>:hi</code> by itself.
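<p>
If you are not sure which rule is colouring a piece of text, Vim can tell you.  Here is a
small sketch; the function name <code>ShowSyntaxRule</code> and the choice of F10 are
arbitrary, so pick whatever suits you.
<pre><code>
" Echo the rule under the cursor and the name it is linked to
fun! ShowSyntaxRule()
   let l:id = synID(line('.'), col('.'), 1)
   echo synIDattr(l:id, 'name') . ' -&gt; ' . synIDattr(synIDtrans(l:id), 'name')
endfun
nnoremap &lt;F10&gt; :call ShowSyntaxRule()&lt;CR&gt;
</code></pre>
With the syntax file above, putting the cursor on "Synonyms:" should report thesSynonyms together with the name it is linked to.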
<p>
<div class="footnote">
<a href="#back1" id="fn1">[1]</a> I said the rule should be in the colorscheme.  In
fact, it is not an error if there is no colour highlighting for that rule, it simply
gets ignored. And you can put highlighting descriptions anywhere, such as your
<code>.vimrc</code>. For example, I quite like the colorscheme I use, except for the Search colours
which I override with a separate description in my <code>.vimrc</code> like this:
<pre><code>
hi Search ctermfg=white ctermbg=darkblue
</code></pre><a href="#back1"> &uarr; </a>
</div>
]]></content:encoded>
<wfw:commentRss>http://www.nickcoleman.org/blog/index.cgi?post=vim-thesaurus-syntax%21201202250900%21blogging%2Cinternet%2Cprogramming%2Csoftware%2Cunix/feed/</wfw:commentRss>
</item>
<item>
<title>How To Set Up A Light-Weight On-line Thesaurus For Vim
</title>
<link>http://www.nickcoleman.org/blog/index.cgi?post=vim-thesaurus%21201202170802%21general%2Cblogging%2Cinternet%2Cprogramming%2Csoftware%2Cunix</link>
<comments>http://www.nickcoleman.org/blog/index.cgi?post=vim-thesaurus%21201202170802%21general%2Cblogging%2Cinternet%2Cprogramming%2Csoftware%2Cunix#comments</comments>
<pubDate>Fri, 17 Feb 2012 08:02:00 -0700</pubDate>
<dc:creator>Nick</dc:creator>
<category>general</category>
<category>blogging</category>
<category>internet</category>
<category>programming</category>
<category>software</category>
<category>unix</category>
<guid isPermaLink="true">http://www.nickcoleman.org/blog/index.cgi?post=vim-thesaurus%21201202170802%21general%2Cblogging%2Cinternet%2Cprogramming%2Csoftware%2Cunix/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[
<p>
<div class="summary">Vim has support for a built-in thesaurus. However, it consumes a
lot of memory, which you may not want for a feature you do not use much, and its
auto-complete selection has issues.  Here is how to set up an on-line thesaurus query that is light weight.</div>
<p>
<h2>Summary</h2>
<p>
This is the first post of two (<a
href="http://www.nickcoleman.org/axs/ax.pl?http://www.nickcoleman.org/blog/index.cgi?post=vim-thesaurus-syntax%21201202250900%21blogging%2Cinternet%2Cprogramming%2Csoftware%2Cunix">second
here</a>) about a light weight way to implement a thesaurus.  It is
great for what I need, which is the occasional use of a thesaurus for writing text such
as this article.  Once it is set up, you can forget about it and just use <b>K</b>
whenever you want to look up a word.
<p>
A nice bonus or synergy of using an online source is that the website also returns a definition for the word, so
it functions as a simple dictionary as well.
<p>
The second post (<a
href="http://www.nickcoleman.org/axs/ax.pl?http://www.nickcoleman.org/blog/index.cgi?post=vim-thesaurus-syntax%21201202250900%21blogging%2Cinternet%2Cprogramming%2Csoftware%2Cunix">here</a>) will deal with how to use Vim's built in syntax rule sets to provide
highlighting and nice colours.
<p>
There are two or three simple steps:
<ol><li>a vim script that passes the cursor word to an external shell script.</li>
<li>a shell script that looks up the word using an online thesaurus, then parses
the output to remove unnecessary cruft.</li>
<li>an optional syntax file to provide highlighting and colours.</li></ol>
<p>
Here is what it looks like with a quick-n-dirty syntax rule set:
<div class="image">
<img src="http://www.nickcoleman.org/blog/images/thesaurus.jpg" title="Vim thesaurus"
alt="vim with thesaurus window" >
<p>Sample output for "quick"
</div>
<p>
<hr>
<p>
I use Vim for programming and it is great for that.  I use it when I need to edit files on remote
servers that I have ssh'ed into.  I also use Vim for writing text because it is a great
text editor.  I am writing this in Vim, for example. 
<p>
Often, one of the things you want when writing text is a thesaurus, which is the topic
for today.
<p>
<h2>Why on-line</h2>
Vim comes with support for a thesaurus, but I've never really liked it, for two reasons.  
<p>
You download a thesaurus (the Moby one is common), point Vim to it and you are good to
go. However, a thesaurus can consume a lot of memory.  The Moby file is 24 MB, although to
be fair Vim doesn't use anything like that.  If you are using Vim remotely, you may not
want to use <em>any</em> extra memory. For example, I have a remote VPS that has only
128 MB of total memory, so every megabyte is critical.  This is especially true if you
use the thesaurus only rarely; the trade-off is not worth it.
<p>
The bigger issue for me is that Vim does not handle a long list of alternative words very well.
It is fine with a short list where you use it just like auto-complete, but it seems to
get confused with a long list.  Unfortunately, Moby often serves up a long list, and you
frequently find you have inadvertently changed your perfectly good word to something
completely different.  You then have to Undo and try again.
<p>
In fact, I got so frustrated with the Vim/Moby combination I ended up unsetting the
thesaurus feature in Vim after just a couple of months.
<p>
The good news is that there is an alternative, and that is to use one of the many
on-line thesauruses. 
<p>
I have set things up so that a script calls a command-line browser to query for a word
and dump the output from the online website, then parses it to remove extra cruft such
as sidebars, headers and footers, and white space, and puts it into a scratch buffer in
Vim itself for me to browse through and perhaps copy a word from.
<p>
<h2>Vim setup</h2>
First, set up Vim to call a script, which I will describe further down, and display the
results in a scratch buffer.  (I shamelessly pulled the concept straight from the ReadMan
script, which displays a Unix man page for the keyword under the cursor.) To use it, simply
press <b>K</b> with the cursor on the word for which you want to see alternatives.  You
can do all the normal things in the buffer like search, jump to a line, copy a word
(<b>yw</b>), and so on. To close it, just hit the <b>q</b> key.  
<p>
Copy and paste the following into your <code>.vimrc</code> or wherever you prefer.  I
have it in a common.vim file in my ftplugin directory, where my various filetype scripts
can source it if they want.
<p>
<div class="pygmentize"><pre><span class="k">fun</span><span class="p">!</span> ReadThesaurus<span class="p">()</span><br><span class="c">   &quot; Assign current word under cursor to a script variable</span><br>   <span class="k">let</span> s:thes_word <span class="p">=</span> expand<span class="p">(</span><span class="s1">&#39;&lt;cword&gt;&#39;</span><span class="p">)</span><br><span class="c">   &quot; Open a new window, keep the alternate so this doesn&#39;t clobber it. </span><br>   <span class="k">keepalt split </span>thes_<br><span class="c">   &quot; Show cursor word in status line</span><br>   <span class="k">exe</span> &quot;<span class="k">setlocal </span><span class="nb">statusline</span><span class="p">=</span>&quot; . <span class="s1">s:thes_word</span><br><span class="c">   &quot; Set buffer options for scratch buffer</span><br>   <span class="k">setlocal</span> <span class="nb">noswapfile</span> <span class="nb">nobuflisted</span> <span class="nb">nowrap</span> <span class="nb">nospell</span> <br>     \<span class="nb">buftype</span><span class="p">=</span>nofile <span class="nb">bufhidden</span><span class="p">=</span>hide <br><span class="c">   &quot; Delete existing content</span><br>   <span class="m">1</span><span class="p">,</span>$<span class="k">d</span><br><span class="c">   &quot; Run the thesaurus script</span><br>   <span class="k">exe </span><span class="s2">&quot;:0r !/home/nickcoleman/bin/thesaurus &quot;</span> . s:thes_word <br><span class="c">   &quot; Goto first line</span><br>   <span class="m">1</span><br><span class="c">   &quot; Set file type to &#39;thesaurus&#39;</span><br>   <span class="k">set</span> <span class="k">filetype</span><span class="p">=</span><span class="nb">thesaurus</span><br><span class="c">   &quot; Map q to quit without confirm</span><br>   <span class="k">nmap </span><span class="p">&lt;</span><span class="nb">buffer</span><span class="p">&gt;</span> <span class="k">q</span> :<span class="k">q</span><span class="p">&lt;</span>CR<span class="p">&gt;</span><br><span class="k">endfun</span><br><span class="c">&quot; Map the K key to the ReadThesaurus function</span><br><span class="nb">noremap</span> <span class="p">&lt;</span><span class="nb">buffer</span><span class="p">&gt;</span> K :<span class="k">call</span> ReadThesaurus<span class="p">()&lt;</span>CR<span class="p">&gt;&lt;</span>CR<span class="p">&gt;</span><br></pre></div>
<p>
The script is fully commented for you to follow what is going on.
<p>
Notice I hardcoded the location of the shell script that Vim calls as
<code>/home/nickcoleman/bin/thesaurus</code> (<code>$HOME/bin/thesaurus</code> on my
machine).  Change this to wherever you are going to put the shell script.
<p>
I call the scratch buffer "thes_" so that it has a name, which means I can
easily re-use the buffer.  Otherwise Vim would create a new buffer every time and
go through buffer numbers like crazy.  In the unlikely event you have an actual file
called "thes_", change the script to use some other wacky name.  I could have used the
Vim function <code>tempname()</code> to generate a unique buffer name, but "thes_" is
meaningful and easy to remember if you want to unhide the buffer later.
<p>
I set the status line to show the word I am looking up.  Sometimes the website will
return a different word, a near synonym, so the status line reminds me exactly which
word I asked for. An example is looking up the word "the", which returns the
definition and synonyms for "histrionical".  No, I don't know why.
<p>
I include a line to set the filetype to "thesaurus". Its purpose is to allow me to set
up anything special for that filetype later on.  A possible use would be to create a
syntax file to do special highlighting or colouring.  That will be the topic for the
next post.
<p>
The mapping for K has a second <code>&lt;CR&gt;</code>.  It eats up the "Press ENTER or type
command to continue" prompt.  There are other ways to do this, but they can mess up the
display or have the side effect of applying globally instead of just this buffer.
<p>
By the way, in case it isn't obvious, the reason I split the functionality into two
parts (a Vim script and a shell script) instead of doing it all in Vim is
because Vim's scripting language, an awkward beast at the best of times, would
make it too hard.  Best to use the good parts of each and combine them.
<p>
<h2>Thesaurus script</h2> Up to now, everything has been operating system agnostic.  The
script below is written for unix-like operating systems, including OS X, because it uses
some unix utilities like links and sed.  (The others it uses such as basename and
readlink aren't absolutely necessary, just good style.) I have written a <a
href="#windows">paragraph or two below </a> on how to get those tools for Windows.
<p>
Originally I wrote a quick-n-dirty one-liner, but I decided the script might be useful
outside Vim as well, so I tidied it up to be a useful script that you can call from the
command line.
<p>
This is the shell script that Vim calls.  Put it in the location you specified in the
Vim script above.  It is straightforward, apart from the <code>sed</code> call, and there
are enough comments so you can see what is happening.
<p>
<div class="pygmentize"><pre><span class="c">#!/bin/bash</span><br><br><span class="c"># This searches an online thesaurus, cuts as much cruft out as possible,</span><br><span class="c"># and displays the definitions.</span><br><br><span class="c">#URL=&#39;http://www.merriam-webster.com/thesaurus/&#39;</span><br><span class="nv">URL</span><span class="o">=</span><span class="s1">&#39;http://thesaurus.com/browse/&#39;</span><br><br><span class="c"># Display help, and quit.</span><br><span class="k">function </span>usage <span class="o">{</span><br>    cat <span class="s">&lt;&lt; EOF_HELP</span><br><span class="s">    Usage: $(basename $0) [-r][-h] word</span><br><span class="s">    Display the parsed results of an online thesaurus search for &lt;word&gt;</span><br><br><span class="s">    -r  Raw results; no filtering.</span><br><span class="s">    -h  Display this help.</span><br><br><span class="s">EOF_HELP</span><br>    <span class="nb">exit</span> <br><span class="o">}</span><br><br><span class="c"># Check for a parameter.  Get the output from the website</span><br><span class="k">function </span>get_thes <span class="o">{</span><br>    <span class="o">[</span>  -z <span class="s2">&quot;$1&quot;</span> <span class="o">]</span> <span class="o">&amp;&amp;</span> usage<br>    <span class="k">$(</span>readlink -e <span class="k">$(</span>which links<span class="k">))</span> -dump <span class="s2">&quot;${URL}$1&quot;</span> <br><span class="o">}</span><br><br><span class="c"># Check for a -h parameter or absence of parameter.</span><br><span class="o">[</span> <span class="s2">&quot;$1&quot;</span> <span class="o">=</span> <span class="s2">&quot;-h&quot;</span> -o -z <span class="s2">&quot;$1&quot;</span> <span class="o">]</span> <span class="o">&amp;&amp;</span> usage<br><br><span class="c"># If -r just get the raw output, otherwise pass it through sed.</span><br><span class="k">if</span> <span class="o">[</span> <span class="s2">&quot;$1&quot;</span> <span class="o">=</span> <span class="s2">&quot;-r&quot;</span> <span class="o">]</span> ; <span class="k">then</span><br><span class="k">    </span><span class="nb">shift</span><br><span class="nb">    </span>get_thes <span class="nv">$1</span><br><span class="k">else</span><br>    <span class="c"># Sed is doing many things: </span><br>    <span class="c"># Print out &quot;no results found for &quot; and quit. 
</span><br>    <span class="c"># Print out only the lines I&#39;m interested in, which are:</span><br>    <span class="c">#   No results found for</span><br>    <span class="c">#   Main Entry:</span><br>    <span class="c">#   Definition:</span><br>    <span class="c">#   and all from Synonyms: to Antonyms:</span><br>    <span class="c"># Put a new line after Antonyms.</span><br>    get_thes <span class="nv">$1</span> | sed -n -e <span class="s1">&#39;/No results found for/ {&#39;</span> <span class="se">\</span><br>      -e <span class="s1">&#39; s|^[ \t]*\(No results .*\)|\1|p ; q} &#39;</span><span class="se">\</span><br>      -e <span class="s1">&#39;{/Main Entry:/p &#39;</span> <span class="se">\</span><br>      -e <span class="s1">&#39;/Definition:/p &#39;</span> <span class="se">\</span><br>      -e <span class="s1">&#39;/Synonyms:/,/\(Antonyms:\)\|\(^$\)/p} &#39;</span> <span class="se">\</span><br>      -e <span class="s1">&#39;/Antonyms:/ a \ &#39;</span><br><span class="k">fi</span>    <br></pre></div>
<p>
I use <b>thesaurus.com</b> because I found it has a wider coverage than
<b>merriam-webster.com</b>.  If you want to use a different website, you will need to
write your own sed script to parse it.
<p>
I separated the sed actions into discrete chunks above so you can get a clearer picture
of what sed is doing, but they appear as one line in the actual script.  That line is
below, for you to cut-n-paste.
<p>
<div class="pygmentize"><pre>    get_thes <span class="nv">$1</span> | sed -n -e <span class="s1">&#39;/No results found for/ { s|^[ \t]*\(No results .*\)|\1|p ; q} ;{/Main Entry:/p ; /Definition:/p ; /Synonyms:/,/\(Antonyms:\)\|\(^$\)/p} ; /Antonyms:/ a \  &#39;</span>
</pre></div>
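<p>
If you want to experiment with the filter without hitting the website each time, a
sketch along the following lines may help.  It is a simplified variant of the sed line
above, not a replacement for it: it leaves out the blank line appended after Antonyms, and
it assumes a sed that accepts <code>-E</code> (GNU and BSD sed both do).  The names
<code>sample_page</code> and <code>filter_thes</code> are made up for this example, and
the sample text is invented to resemble a text dump; it is not real website output.

```shell
#!/bin/sh
# Made-up stand-in for the text dump of a results page (an assumption,
# not real website output).
sample_page() {
    cat <<'EOF'
   Log in | Sign up | Site navigation cruft
   Main Entry: loose
   Part of Speech: adjective
   Definition: not tight
   Synonyms:   baggy, slack,
               relaxed
   Antonyms:   tight
   More cruft that should be dropped
EOF
}

# Simplified filter: report "No results" and quit, otherwise print the
# Main Entry and Definition lines and everything from Synonyms: up to
# Antonyms: (or a blank line, whichever comes first).
filter_thes() {
    sed -n -E \
        -e '/No results found for/{' \
        -e 's/^[[:space:]]*//p' \
        -e 'q' \
        -e '}' \
        -e '/Main Entry:/p' \
        -e '/Definition:/p' \
        -e '/Synonyms:/,/Antonyms:|^$/p'
}

sample_page | filter_thes
```

Splitting the sed commands over several <code>-e</code> options keeps each action on
its own line, which makes it easier to adapt one piece at a time for another website.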
<p>
<h2>Install</h2>
<p>
I prefer that this script is only available in buffers where I am writing text.  That
way, I can keep the <b>K</b> mapping for unix man pages in buffers where I am writing code.
<p>
The way to do that is to put the Vim script in its own file that is called only by
text-like filetypes.  Scripts that are unique to filetypes are put in <code>~/.vim/ftplugin</code> or
<code>~/.vim/after/ftplugin</code>.  I created files named text.vim, xml.vim and blog.vim that are in
the ftplugin directory.  (XML is treated as text because most of my XML is text data.)
I put the thesaurus script in a file called common.vim and I <code>source
common.vim</code> from within all the above ftplugin scripts.  
<p>
If ftplugin (and perhaps autocmd) are a mystery, see <code>:h ftplugin</code> and <code>:h autocmd</code>.
<p>
With that, I am done.  To use, move the cursor in Normal mode onto the word of interest
and press <b>K</b>.
<p>
<h3>Bugs</h3>
<p>
I noticed once that the script loses permissions to a temporary file that Vim uses
internally.  It seems to only happen if a remote ssh session in screen is unexpectedly
terminated, and not always then.  Closing and re-opening Vim (yes, a pain) will fix it.
<p>
<a name="windows"></a>
<p>
<h2>Additional Info for Windows Users</h2>
<p>
<h4>Summary</h4>
In summary, Windows users have a little extra to do, but not much.
<ol><li>Install <code>sed</code> and <code>links</code>. It takes only a few seconds
each to download and install.  The default install is fine.</li>
<li>Create a batch file to get and parse the thesaurus data. The batch file will be an
abbreviated form of the shell script above, tailored for Windows.</li>
<li>Point vim to the location of the batch file.</li></ol>
<p>
<h4>Install Links &amp; Sed</h4>
Windows users will need to install the <code>links</code> and <code>sed</code>
commands, and perhaps a shell if they don't want to use a DOS batch file (which is below).
I did a bit of a search and found a  <a
href="http://www.nickcoleman.org/axs/ax.pl?http://links.twibright.com/download/binaries/win32/">Windows
build of <code>links</code> here</a>, and a good set of  <a
href="http://www.nickcoleman.org/axs/ax.pl?http://gnuwin32.sourceforge.net/packages.html">unix
tools for Windows here</a> (direct link for sed is <a
href="http://www.nickcoleman.org/axs/ax.pl?http://gnuwin32.sourceforge.net/packages/sed.htm">
here</a>), both of which are fine.
<p>
<h4>Create DOS Batch File</h4>
You don't need the full-blown shell script above.  The simple DOS batch file below will do. It
runs links and then pipes the output to sed.  You might need to change the paths to the
folder(s) where you installed links and sed if you did not use the defaults when you
installed them.
<p>
<div class="pygmentize"><pre>@"c:\program files\links\links.exe" -dump http://thesaurus.com/browse/%1 | "c:\program files\gnuwin32\bin\sed.exe" -n -e <span class="s1">&quot;/No results found for/ { s|^[ \t]*\(No results .*\)|\1|p ; q} ;{/Main Entry:/p ; /Definition:/p ; /Synonyms:/,/\(Antonyms:\)\|\(^$\)/p} ; /Antonyms:/ a \  &quot;</span><br></pre></div>
<p>
There are a couple of differences to the unix script.  Sed needs double quotes
surrounding its commands instead of single quotes.  The paths to the executables need
double quotes because of the spaces in the path.  I put a '@' in front of everything
to prevent DOS from echoing back the entire command. Finally, I put <code>%1</code>
instead of <code>$1</code> for the DOS way to pass in the parameter.
<p>
It is probably worth opening a <code>cmd.exe</code> window and testing the batch file.  Assuming you
called the batch file "thesaurus.bat", run <code>thesaurus.bat loose</code> and you
should get a listing back after a few seconds with all the synonyms of "loose".
<p>
<h4>Point Vim</h4>
Now that you have a batch file that works, put it in a folder somewhere. The vim script
needs one small change to point to that location.  
<p>
In Windows, $HOME expands to "C:\Documents and Settings\{user}".<a href="#fn1"
id="back1"><sup>1</sup></a>  However, it doesn't
expand in a script if it is contained within quotes.  So the vim script should be
changed from 
<p>
<div class="code">
" Run the thesaurus script
   :exe ":0r !$HOME/bin/thesaurus " . s:thes_word 
</div>
<p>
to 
<p>
<div class="code">
" Run the thesaurus script
   :exe ':0r !"' . $HOME . '\thesaurus.bat" ' . s:thes_word
</div>
assuming again that your DOS batch file is called "thesaurus.bat".  This puts
$HOME outside the quotes and Vim will expand it to "C:\Documents and
Settings\{user}\thesaurus.bat"<a href="#fn1"><sup>1</sup></a>.  
<p>
If you put the batch file in a sub folder, add that folder name in front of
<code>'\thesaurus.bat'</code> like this: <code>'\my_folder\thesaurus.bat'</code>.
<p>
If it does not work even though the batch file ran correctly when you tested it on its own, the problem is
almost certainly in how you specified the location of your batch file in the vim script.
<p>
<hr>
<p>
<div class="footnote">
<a href="#back1" id="fn1">[1] for Windows XP.  Vista expands to "C:\Users\{user}".  Windows
7 the same as Vista? -- I don't know, I don't have it. </a> <a href="#back1">&#8593;</a>
</div>
<p>
]]></content:encoded>
<wfw:commentRss>http://www.nickcoleman.org/blog/index.cgi?post=vim-thesaurus%21201202170802%21general%2Cblogging%2Cinternet%2Cprogramming%2Csoftware%2Cunix/feed/</wfw:commentRss>
</item>
<item>
<title>Get Google-chrome's Version (2)
</title>
<link>http://www.nickcoleman.org/blog/index.cgi?post=chromeversion-2%21201111211027%21unix%2Cinternet</link>
<comments>http://www.nickcoleman.org/blog/index.cgi?post=chromeversion-2%21201111211027%21unix%2Cinternet#comments</comments>
<pubDate>Mon, 21 Nov 2011 10:27:00 -0700</pubDate>
<dc:creator>Nick</dc:creator>
<category>unix</category>
<category>internet</category>
<guid isPermaLink="true">http://www.nickcoleman.org/blog/index.cgi?post=chromeversion-2%21201111211027%21unix%2Cinternet/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[
<p>
I <a
href="http://www.nickcoleman.org/axs/ax.pl?http://www.nickcoleman.org/blog/index.cgi?post=chromeversion%21201111090808%21unix%2Cinternet">wrote
previously</a> about a small script to extract Google-chrome's version number from
its deb file.  To recap, Google provides only a deb file (a Debian package, also used by
Ubuntu) or an rpm file (for Red Hat and its kin, such as Fedora).  Other unixes and Linux
distributions need to create their own packages, and Slackware provides a script to pull the deb
file apart and repackage it into a Slackware package. 
<p>
I wrote a small script based on the Slackware script that extracts the google-chrome
version number from the deb file so you can see if you have the latest version or not.
(None of this would be necessary if Google would name their deb file to include the
version number.  I don't know why they don't do this.)
<p>
I noticed the script is not particularly efficient so I rewrote the section that
actually gets the version number.  Here it is (first the original, then the update):
<p>
<div class="pygmentize"><pre> <span class="c"># The original</span>
<span class="nv">VERSION</span><span class="o">=</span><span class="k">$(</span>ar p <span class="s2">&quot;$FILE&quot;</span> control.tar.gz 2&gt; /dev/null | tar zxO ./control 2&gt; /dev/null <span class="se">\</span>
	| grep Version | awk <span class="s1">&#39;{print $2}&#39;</span> | cut -d- -f1<span class="k">)</span>
 <span class="c"># A better version that does everything in awk</span>
<span class="nv">VERSION</span><span class="o">=</span><span class="k">$(</span>ar p <span class="s2">&quot;$FILE&quot;</span> control.tar.gz 2&gt; /dev/null | tar zxO ./control 2&gt; /dev/null <span class="se">\</span>
	| awk -F <span class="s2">&quot;[ :-]&quot;</span> <span class="s1">&#39;/Version/ {print $3}&#39;</span><span class="k">)</span>
</pre></div>
<p>
The difference is that the second version uses awk for everything instead of using grep
then awk then cut, so it avoids two extra pipes and two extra processes.
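<p>
To check that the two versions really are equivalent without needing a deb file to hand,
something like the following sketch can be used.  The function names are made up for
this example, and <code>control_sample</code> is a trimmed stand-in for the control
file shown in the Details section of this post.

```shell
#!/bin/sh
# Stand-in for the control file that ar and tar extract from the deb,
# trimmed to three lines.
control_sample() {
    cat <<'EOF'
Package: google-chrome-stable
Version: 15.0.874.121-r109964
Architecture: i386
EOF
}

# The original: three processes.
version_original() {
    grep Version | awk '{print $2}' | cut -d- -f1
}

# The rewrite: one awk process does the matching and the splitting.
version_awk() {
    awk -F '[ :-]' '/Version/ {print $3}'
}

control_sample | version_original
control_sample | version_awk
```

Both functions read the control text on standard input, so either can be dropped in
after the <code>ar</code> and <code>tar</code> part of the pipeline.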
<p>
It is small beer, but, since some of these scripts are by their nature tutorials, we
should probably use the better versions.  Without meaning to disparage the original
authors, using grep then awk is a classic beginner's mistake and shouldn't serve as
an example of best practice.
<p>
<h3>Details</h3>
<p>
This is where I go into detail.  Skip it if you are not interested.
<p>
<div class="pygmentize"><pre>ar p <span class="s2">&quot;$FILE&quot;</span> control.tar.gz 2&gt; /dev/null | 
	tar zxO ./control 2&gt; /dev/null
</pre></div>
produces the following output:
<p>
<div class="code">
Package: google-chrome-stable
Version: 15.0.874.121-r109964
Architecture: i386
Maintainer: Chrome Linux Team &lt;chromium-dev@chromium.org&gt;
Installed-Size: 104068
Pre-Depends: dpkg (>= 1.14.0)
Depends: libasound2 (>> 1.0.22), libbz2-1.0, libc6 (>= 2.11), libcairo2 (>= 1.6.0),
...
...
...
Description: The web browser from Google
 Google Chrome is a browser that combines a minimal design with sophisticated technology
to make the web faster, safer, and easier.
</div>
The line we are interested in is the second line, <code>Version: 15.0.874.121-r109964</code>.  We don't
need to grep for that line; awk can find it itself with <code>/Version/</code>. 
<p>
We also don't need to cut out the version number; we can separate the line into fields
in awk and print out the relevant field. Looking at the line, the field we want is
preceded by a colon and a space and followed by a dash.  We use the command line option <code>-F
"[ :-]"</code> which gives awk a regular expression to use as the field separator.  That is,
a space, a colon or a dash each ends a field.  The colon and the space that follows it
enclose an empty second field, so the version number is the third
field (<code>{print $3}</code>). 
<p>
The result of this is <code>15.0.874.121</code>.
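<p>
If the field numbering is hard to picture, a small sketch like this one prints every
field awk sees for the Version line.  <code>show_fields</code> is a made-up helper name;
its argument is the separator set passed to <code>-F</code>.

```shell
#!/bin/sh
# Print every field awk sees for the Version line, given a separator set.
show_fields() {
    printf 'Version: 15.0.874.121-r109964\n' |
        awk -F "$1" '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }'
}

show_fields '[ :-]'
show_fields '[ -]'
```

With <code>[ :-]</code> the version number lands in field 3, because the colon and the
space after it enclose an empty field 2.  With only a space and a dash as separators,
the colon stays attached to "Version" and the number would be field 2 instead.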
<p>
]]></content:encoded>
<wfw:commentRss>http://www.nickcoleman.org/blog/index.cgi?post=chromeversion-2%21201111211027%21unix%2Cinternet/feed/</wfw:commentRss>
</item>
<item>
<title>Get Google-Chrome's Version from Deb File Download
</title>
<link>http://www.nickcoleman.org/blog/index.cgi?post=chromeversion%21201111090808%21unix%2Cinternet</link>
<comments>http://www.nickcoleman.org/blog/index.cgi?post=chromeversion%21201111090808%21unix%2Cinternet#comments</comments>
<pubDate>Wed, 09 Nov 2011 08:08:00 -0700</pubDate>
<dc:creator>Nick</dc:creator>
<category>unix</category>
<category>internet</category>
<guid isPermaLink="true">http://www.nickcoleman.org/blog/index.cgi?post=chromeversion%21201111090808%21unix%2Cinternet/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[
<p>
Here is a quick script to pull out google-chrome's version number from the deb file that
you download from Google's website. I call it <code>getversion.sh</code>.  I find it
useful when you want to see if the google-chrome version is newer than the one you have
installed on your Slackware system without having to go through the whole build process.
<p>
The method is pulled from Slackware's build script which itself thanks Fred Richards. 
<p>
To use it: <code>./getversion.sh [ -h | filename ] </code> If you don't provide a
filename, it will look in the current directory for the default-named stable deb file.
It will print this: <code>google-chrome-stable_current_i386.deb file's version is:
15.0.874.106</code>.
<p>
Because the output is separated by a ":", in other words into tokens, you can use it in
a script to get just the version number.  <code>./getversion.sh | awk -F: '{print $2}'</code> will output just the
number like this: <code> 15.0.874.106</code>.
<p>
You could put that functionality within the script and signal it with a command line
flag, but I haven't bothered.  I leave it as an exercise for the reader.
<p>
Here is the script:
<p>
<div class="pygmentize"><pre><span class="c">#!/bin/sh</span>
<span class="k">if</span> <span class="o">[</span> <span class="s2">&quot;x$1&quot;</span> <span class="o">=</span> x-h <span class="o">]</span> ; <span class="k">then</span>
<span class="k">    </span>cat <span class="s">&lt;&lt;-END_HELP</span>
<span class="s">    This extracts the version number of google-chrome from the deb file specified on the command line</span>
<span class="s">    or in the current directory if none is specified. (Cobbled from the Slackbuild script.)</span>
<span class="s">END_HELP</span>
    <span class="nb">exit</span>
<span class="k">fi</span>
<span class="nv">RELEASE</span><span class="o">=</span><span class="k">${</span><span class="nv">RELEASE</span><span class="k">:-</span><span class="nv">stable</span><span class="k">}</span>    <span class="c"># stable, beta, or unstable</span>
<span class="k">case</span> <span class="s2">&quot;$(uname -m)&quot;</span> in
  i?86<span class="o">)</span> <span class="nv">DEBARCH</span><span class="o">=</span><span class="s2">&quot;i386&quot;</span> ; <span class="nv">LIBDIRSUFFIX</span><span class="o">=</span><span class="s2">&quot;&quot;</span> ; <span class="nv">ARCH</span><span class="o">=</span>i386 ;;
  x86_64<span class="o">)</span> <span class="nv">DEBARCH</span><span class="o">=</span><span class="s2">&quot;amd64&quot;</span> ; <span class="nv">LIBDIRSUFFIX</span><span class="o">=</span><span class="s2">&quot;64&quot;</span> ; <span class="nv">ARCH</span><span class="o">=</span>x86_64 ;;
  *<span class="o">)</span> <span class="nb">echo</span> <span class="s2">&quot;Package for $(uname -m) architecture is not available.&quot;</span> ; <span class="nb">exit </span>1 ;;
<span class="k">esac</span>
<span class="nv">FILE</span><span class="o">=</span><span class="s2">&quot;google-chrome-${RELEASE}_current_${DEBARCH}.deb&quot;</span>
<span class="k">if</span> <span class="o">[</span> <span class="nv">$# </span>-gt 0 <span class="o">]</span> ; <span class="k">then</span>
<span class="k">    </span><span class="nv">FILE</span><span class="o">=</span><span class="s2">&quot;$1&quot;</span>
<span class="k">fi</span>
<span class="k">if</span> <span class="o">[</span> -f <span class="s2">&quot;$FILE&quot;</span> <span class="o">]</span> ; <span class="k">then</span>
    <span class="c"># Get the version from the Debian/Ubuntu .deb (thanks to Fred Richards):</span>
    <span class="nv">VERSION</span><span class="o">=</span><span class="k">$(</span>ar p <span class="s2">&quot;$FILE&quot;</span> control.tar.gz 2&gt; /dev/null | tar zxO ./control 2&gt; /dev/null |
		grep Version | awk <span class="s1">&#39;{print $2}&#39;</span> | cut -d- -f1<span class="k">)</span>
    <span class="nb">echo</span> <span class="s2">&quot;$FILE file&#39;s version is: $VERSION&quot;</span>
<span class="k">else</span>
<span class="k">    </span><span class="nb">echo</span> <span class="s2">&quot;$FILE not found.&quot;</span>
<span class="k">fi</span>
</pre></div>
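<p>
For anyone who wants a head start on the exercise mentioned above, here is one possible
sketch of the command line flag.  Everything in it is hypothetical: the flag name
<code>-q</code> and the helper names <code>report_version</code> and
<code>parse_quiet</code> are invented for this example and are not part of the script
above.  The version number is passed in as an argument so the sketch can run without a
deb file.

```shell
#!/bin/sh
# Hypothetical reporting helper: $1 is "yes" for number-only output,
# $2 is the deb file name, $3 is the version already extracted.
report_version() {
    if [ "$1" = yes ]; then
        printf '%s\n' "$3"
    else
        printf "%s file's version is: %s\n" "$2" "$3"
    fi
}

# Hypothetical option parsing: -q (an invented flag name) asks for the
# number only.  Prints "yes" or "no".
parse_quiet() {
    quiet=no
    OPTIND=1
    while getopts q opt; do
        case $opt in
            q) quiet=yes ;;
            *) return 2 ;;
        esac
    done
    printf '%s\n' "$quiet"
}

report_version "$(parse_quiet -q)" google-chrome-stable_current_i386.deb 15.0.874.106
report_version "$(parse_quiet)" google-chrome-stable_current_i386.deb 15.0.874.106
```

In the real script the third argument would be <code>$VERSION</code>, and the flag
would need to be shifted off before the filename is read from <code>$1</code>.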
]]></content:encoded>
<wfw:commentRss>http://www.nickcoleman.org/blog/index.cgi?post=chromeversion%21201111090808%21unix%2Cinternet/feed/</wfw:commentRss>
</item>
<item>
<title>Decoding a Spammer's Attempt to Obfuscate His IP Address
</title>
<link>http://www.nickcoleman.org/blog/index.cgi?post=ipdottedquad%21201109181847%21internet%2Cunix%2Cgeneral</link>
<comments>http://www.nickcoleman.org/blog/index.cgi?post=ipdottedquad%21201109181847%21internet%2Cunix%2Cgeneral#comments</comments>
<pubDate>Sun, 18 Sep 2011 18:47:00 -0700</pubDate>
<dc:creator>Nick</dc:creator>
<category>internet</category>
<category>unix</category>
<category>general</category>
<guid isPermaLink="true">http://www.nickcoleman.org/blog/index.cgi?post=ipdottedquad%21201109181847%21internet%2Cunix%2Cgeneral/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[
<p>
<div class="summary">IPv4 addresses are familiar as dotted quads.  A spammer uses an
interesting feature of IP addressing to obfuscate his address.  A look at the
various ways of specifying an IPv4 address.</div>
<p>
I received a spam email yesterday that was phishing for some banking details.  It
contained the usual "your account has been disabled, you need to reactivate it" spiel,
along with a link to click.  When I hovered the cursor over the link, it displayed the
bank's URL.  Or did it?
<p>
<div class="image">
<img src="http://www.nickcoleman.org/blog/images/phishing.jpg" alt="URL address shown on
hover" width="446" height="33"><p>URL displayed when hovering over the link</div>
<p>
A cursory glance shows the link's URL pointing to <b>firstdirect.com</b>.  There is
some other stuff before the <i>firstdirect.com</i>, but I wonder if many people would
query it, especially since it appears to contain the quite-common <i>www1</i> prefix.
<p>
If we look closely, however, we can see that the actual domain part of the URL is
<i>95.11064393</i>, which is followed by a directory of <i>www1.firstdirect.com</i>.
There is a "/" almost hidden between the two parts.  It is quite easy to overlook, which is
probably the phisher's intent.
<p>
We all know this is a common technique of phishers: make the bank name <i>appear</i> as
though it is part of the domain name.  The bit that I found interesting was the <em>domain</em>.
It is not a fully-qualified domain name; it doesn't have a .com, .org, etc, so it must
be an IP address.  But it is not in the familiar dotted quad notation of
123.123.123.123.  How does it work?
<p>
<h2>IP Addresses Revisited</h2>
<p>
An IP address is simply a number, a 32 bit number.  It can be represented in dotted-decimal quad
notation, the most common, or it can be in decimal, hexadecimal, octal or even binary.
For dotted quad, each of the four component parts is multiplied by 256<sup>n</sup>, where n
is 3 for the left-most part and 0 for the right-most part, and the results are added together.
<p>
Therefore, for IP address 74.125.237.20 (a Google server), the decimal equivalent is
<div class="code">
<p>
<b>74</b>×256<sup>3</sup> + <b>125</b>×256<sup>2</sup> + <b>237</b>×256<sup>1</sup> + <b>20</b>×256<sup>0</sup> = 1249766676
</div>
and we can test that by pinging it:
<p>
<div class="code">
$ ping 1249766676
PING 1249766676 (74.125.237.20) 56(84) bytes of data.
64 bytes from 74.125.237.20: icmp_req=1 ttl=53 time=65.1 ms
^C
</div>
Here we see that pinging 1249766676 tells us two things:
<ol><li>it is a valid address since the server does respond;</li>
<li>ping accepts an address written as a single decimal number as well as a dotted
quad.</li></ol>
It also works for hexadecimal. 1249766676 is 0x4a7ded14 in hexadecimal:
<p>
<div class="code">
$ ping 0x4a7ded14
PING 0x4A7DED14 (74.125.237.20) 56(84) bytes of data.
64 bytes from 74.125.237.20: icmp_req=1 ttl=53 time=63.6 ms
^C
</div>
and it works.
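<p>
The conversion from dotted quad to a single number can also be done in plain shell
arithmetic.  This is a sketch rather than a polished tool: <code>quad_to_int</code> is a
made-up name, it does no input validation, and it assumes a shell whose arithmetic is
wider than 32 bits (true of the common shells on 64-bit systems).  Shifting left by 24,
16 and 8 bits is the same as multiplying by 256<sup>3</sup>, 256<sup>2</sup> and 256.

```shell
#!/bin/sh
# Convert a four-part dotted quad to its single-number form.
quad_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

n=$(quad_to_int 74.125.237.20)
printf '%s\n' "$n"        # decimal form
printf '0x%x\n' "$n"      # hexadecimal form
```

For 74.125.237.20 this prints 1249766676 and 0x4a7ded14, the two forms pinged above.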
<p>
<h2>Interesting</h2>
<p>
Now it gets interesting.  Up to now, we have converted numbers in their entirety.  You
can also convert them and <b>still use dotted quad notation</b>. So you can convert
74.125.237.20 to dotted quad hex, which is 0x4a.0x7d.0xed.0x14 and it works:
<div class="code">
$ ping 0x4a.0x7d.0xed.0x14
PING 0x4A.0x7D.0xED.0x14 (74.125.237.20) 56(84) bytes of data.
64 bytes from 74.125.237.20: icmp_req=1 ttl=53 time=61.9 ms
</div>
and you can even mix them up with some quads in decimal and some in hexadecimal:
<p>
<div class="code">
$ ping 0x4a.125.0xed.20
PING 0x4A.125.0xED.20 (74.125.237.20) 56(84) bytes of data.
64 bytes from 74.125.237.20: icmp_req=1 ttl=53 time=62.5 ms
</div>
<p>
<h2>More Interesting</h2>
<p>
To recap, you can use a variety of number bases such as 8, 10 and 16 to specify an IP
address, and you can do that with the entire number or with each dotted quad separately.
Now we get to the really interesting bit.  
<p>
You can <b>use part dotted quad and part whole number</b>.  This is what the phisher has done.
The IP address of <b>95.11064393</b> is using the first quad and then an entire number for the
second, third and fourth quads.  It translates to: 
<div class="code">
95×256<sup>3</sup> + 11064393 = 1604899913
$ ping 1604899913
PING 1604899913 (95.168.212.73) 56(84) bytes of data.
64 bytes from 95.168.212.73: icmp_req=1 ttl=52 time=299 ms
^C
</div>
The spammer's IP address of 95.11064393 translates to 95.168.212.73. Who is that?
<p>
<div class="code">
$ whois 95.168.212.73
inetnum:        95.168.212.64 - 95.168.212.127
netname:        SUPERNETWORK-STUDIO51-3
descr:          Studio51 s.r.o.
country:        CZ
</div>
It is a server in the Czech Republic.
<p>
For those who want the math of converting the number to dotted quad, it is just
successive integer division by decreasing powers of 256.  It sounds more complicated
than it is, so here is the bc<a href="#fn1" id="back1"><sup>1</sup></a> script (<i>scale</i> in bc sets
the number of decimal places; we want whole numbers so set it to 0): 
<div class="code">
$ echo 'scale=0; a=11064393; x=a/256^2; b=a%256^2; y=b/256; z=b%256; 
  print x,".",y,".",z,"\n"' | bc
168.212.73
</div>
which you can see is the second, third and fourth quads that are shown in the
<b>ping</b> above.
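<p>
The same arithmetic can be wrapped up in shell to decode any of the forms seen so far
back to a plain dotted quad.  This sketch follows the convention of the C library's
<code>inet_aton</code> function, where the last part of a one-, two- or three-part
address fills all the remaining bytes.  The name <code>normalize_ip</code> is made up,
there is no range checking, and shell arithmetic wider than 32 bits is assumed.

```shell
#!/bin/sh
# Normalize an address of 1 to 4 dot-separated parts (each decimal, or
# hex with a leading 0x) to a plain dotted quad.
normalize_ip() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split on dots into positional parameters
    IFS=$old_ifs
    case $# in
        1) n=$(( $1 )) ;;
        2) n=$(( ($1 << 24) + $2 )) ;;
        3) n=$(( ($1 << 24) + ($2 << 16) + $3 )) ;;
        4) n=$(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 )) ;;
        *) return 1 ;;
    esac
    # Peel off one byte at a time, most significant first.
    echo "$(( (n >> 24) & 255 )).$(( (n >> 16) & 255 )).$(( (n >> 8) & 255 )).$(( n & 255 ))"
}

normalize_ip 95.11064393
normalize_ip 1249766676
normalize_ip 0x4a.125.0xed.20
```

The first call decodes the phisher's address to 95.168.212.73; the other two both give
74.125.237.20.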
<p>
<h2>Seriously</h2>
<p>
I had planned to put a link to the various RFCs for IP addressing, but, because the IP
protocol is so old and various bits have been changed many times, it gets too confusing,
not least to me.  Instead, 
<a href="http://www.nickcoleman.org/axs/ax.pl?http://en.wikipedia.org/wiki/IPv4#Addressing">here is a link</a>
to a good article on Wikipedia about IP addressing.
<p>
The main idea to take away is that an IP address is simply a 32 bit number, and the
various utilities that we use, such as ping and our browsers, are capable of accepting
that number in a wide variety of formats, some of them quite surprising.
<p>
As for our spammer, I guess he was clever in trying to obfuscate the URL so that it looks
less like an IP address.  It didn't work though.
<p>
<hr>
<div class="footnote">
All IP addresses in this post refer to IPv4.  IPv6 is a different kettle of fish.
</div>
<p>
<div class="footnote">
<a href="#back1" id="fn1">[1] </a>bc is a unix calculator utility <a href="#back1">&#8593;</a>
<p>
I used bc to calculate all the numbers for this post, such as this example for
hexadecimal output: 
<code>
<br>$ echo 'obase=16; a=74*256^3+125*256^2+237*256+20; print "0x",a,"\n"' | bc
0x4A7DED14</code>
</div>
]]></content:encoded>
<wfw:commentRss>http://www.nickcoleman.org/blog/index.cgi?post=ipdottedquad%21201109181847%21internet%2Cunix%2Cgeneral/feed/</wfw:commentRss>
</item>
<item>
<title>Prey: Replace Streamer with Mplayer
</title>
<link>http://www.nickcoleman.org/blog/index.cgi?post=preymplayer%21201109111035%21internet%2Cunix%2Cgeneral</link>
<comments>http://www.nickcoleman.org/blog/index.cgi?post=preymplayer%21201109111035%21internet%2Cunix%2Cgeneral#comments</comments>
<pubDate>Sun, 11 Sep 2011 10:35:00 -0700</pubDate>
<dc:creator>Nick</dc:creator>
<category>internet</category>
<category>unix</category>
<category>general</category>
<guid isPermaLink="true">http://www.nickcoleman.org/blog/index.cgi?post=preymplayer%21201109111035%21internet%2Cunix%2Cgeneral/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[
<p>
<div class="summary">Prey is a recovery application for stolen mobile devices that,
among other things,  takes a photo through the webcam.  It uses streamer, which can be
problematic.  Replace streamer with mplayer.</div>
<p>
<a href="http://www.nickcoleman.org/axs/ax.pl?http://preyproject.com/">Prey</a> is an
application that can help you track your mobile device if it is stolen.  It is pretty
neat and I have it installed on my laptop.  
<p>
It does its job by checking in with a central server every so often.  If your device is
stolen, you mark it on the server (using another computer) and then prey, upon checking
in and realising the device is marked as missing, will run its recovery routine.  It
gathers information about the device's IP address, GPS location if available, MAC
address and so on, takes a snapshot of the screen, and takes a photo through the
device's camera.  It bundles that up and sends it to the central server, or you can tell
it to send it to you in an email.
<p>
Armed with that information, you can set about recovering your device.
<p>
Prey is available for all the major platforms: Windows, Mac and Linux.  It uses common
unix utilities (compiled especially in the case of Windows).  For Linux it
assumes that those utilities are already available on the system, which for the most
part they are.  However, streamer, the webcam photo taker, is not.  Many distributions
have it as part of the xawtv package.  That means you have to pull in the entire xawtv
package and dependencies just for one application, streamer.
<p>
I tried building xawtv on my Slackware 13.37 machine and couldn't without having to jump
through hoops, which I didn't feel like doing just to get one application out of it.
Streamer itself built ok, but it couldn't find the webcam to grab. I don't know if that
was because of the build problems or was something else, but I was not inclined to
investigate because there is an alternative.
<p>
Every major Linux distribution has access to an application that is quite capable of
taking photos: <a
href="http://www.nickcoleman.org/axs/ax.pl?http://www.mplayerhq.hu/design7/news.html">mplayer</a>.
I decided to see how easy it would be to change streamer to mplayer. 
<p>
Let's find out where streamer is used. 
<div class="code">
prey $ grep -r -m 1 streamer *
README: o streamer (for webcam capture) --> in Fedora the xawtv package includes it
modules/webcam/config:# frames per second (for streamer in linux)
modules/webcam/platform/linux/functions:        # do we have streamer installed ?
</div>
That is encouraging: streamer appears in the README, a config file, and only one source
file.  This should be doable.  Time to see how easy it is to take photos
with mplayer. 
<p>
I've done this before so I know the options for mplayer:
<div class="code">
mplayer -tv driver=v4l2:fps=2 -vo jpeg -vf framestep=10 -frames 10 tv://</div>
From left to right, this sets up the streaming video capture module "tv", uses the
webcam driver "v4l2" (a common one for many webcams) and sets it to 2 frames/second,
sets the output to jpeg, renders every 10th frame (at 2 frames/second that is one every
5 seconds), saves 10 frames, and uses the tv:// device.  
<p>
I use framestep to give the webcam time to warm up.  Many cams need a second or two to
get the brightness and contrast levels correct and they futz around during that time.
We need the camera to be switched on for at least a couple of seconds, and we take 10
snapshots and use one of the later ones.
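<p>
The command can be wrapped in a small helper so that the output directory is explicit.
This is only a sketch: the function names <i>frame_name</i> and <i>capture_frames</i> are
made up for illustration and are not part of Prey, and it assumes mplayer numbers its
jpeg output 00000001.jpg, 00000002.jpg and so on.

```shell
# Sketch only: frame_name and capture_frames are hypothetical helper names.

# Build the file name assumed for the Nth jpeg written by mplayer,
# e.g. frame 5 becomes 00000005.jpg.  Pass a plain decimal number.
frame_name() {
    printf '%08d.jpg' "$1"
}

# Capture frames from the webcam into the directory given as $1,
# using the same mplayer options as the command above.
capture_frames() {
    mplayer -tv driver=v4l2:fps=2 -vo "jpeg:outdir=$1" \
        -vf framestep=10 -frames 10 tv:// > /dev/null 2>&1
}
```

You might call it as <code>capture_frames "$tmpdir"</code> and then look for
<code>"$tmpdir/$(frame_name 5)"</code>.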
<p>
Now back to prey.  I know from the grep above that streamer is used in file
<i>modules/webcam/platform/linux/functions</i>.  Here is a listing:
<div class="code">
 #!/bin/bash
 ####################################################################
 # Prey Webcam Module Linux Functions - by Tomas Pollak (bootlog.org)
 # URL: http://preyproject.com
 # License: GPLv3
 ####################################################################
<p>
take_picture() {
    # do we have streamer installed ?
    local streamer=`which streamer`
    if [ -n "$streamer" ]; then
	# take four pictures every 0.5 seconds as JPEG
	$streamer -t 4 -r 0.5 -o "$tmpdir/streamer0.jpeg" &> /dev/null
	if [ -f "$tmpdir/streamer3.jpeg" ]; then # we got it
	    mv "$tmpdir/streamer3.jpeg" "$webcam__picture" > /dev/null
	    rm -f "$tmpdir/streamer{0,1,2}.jpeg" 2> /dev/null
	else # some webcams are unable to take JPGs so we try to grab a PPM
	    $streamer -t 4 -r 0.5 -o "$tmpdir/streamer0.ppm" &> /dev/null
	    if [ -f "$tmpdir/streamer3.ppm" ]; then # good
<p>
		local convert=`which convert`
		if [ -n "$convert" ]; then # lets convert it to jpg
		    $convert "$tmpdir/streamer3.ppm" "$webcam__picture" > /dev/null
		else # lets just send it as a PPM
		    log " -- Could't find Imagemagick! Sending image as PPM."
		    webcam__picture="$tmpdir/streamer3.ppm"
		fi
<p>
		rm -f "$tmpdir/streamer{0,1,2}.ppm"
	    fi
	fi
    fi
}
<p>
capture_video() {
    # we should already know if we do have streamer
    if [ -n "$streamer" ]; then
	local frames=$(( $webcam__video_capture_time * $webcam__frames_per_second ))
	$streamer -o "$webcam__video" -f yuv2 -F stereo -r $webcam__frames_per_second -t $frames
    fi
}
</div>
We can ignore function capture_video as it is not currently used in the default
configuration.  Looking at function take_picture, we can see that the streamer
application is set in a local variable, which is used throughout the rest of the script.
That makes it easy to substitute mplayer.  The other thing to change is the command line
options and the names of the jpegs.  We also can ignore the <i>else</i> clause because
mplayer <i>is</i> capable of taking jpegs so that clause will never be entered.
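<p>
Because the tool is looked up once and kept in a variable, the lookup could also be made
to try several candidates in turn.  The following is a sketch of that idea, not what Prey
or my patch does; <i>first_available</i> is a hypothetical helper name.  Note that each
tool would still need its own command line options.

```shell
# Sketch only: first_available is a hypothetical helper, not part of Prey.

# Print the location of the first tool in the argument list that is
# installed, and return 0.  Return 1 if none of them can be found.
first_available() {
    local tool path
    for tool in "$@"; do
        if path=$(command -v "$tool" 2> /dev/null) && [ -n "$path" ]; then
            printf '%s\n' "$path"
            return 0
        fi
    done
    return 1
}
```

For example, <code>first_available streamer mplayer</code> would prefer streamer where it
is installed and fall back to mplayer otherwise.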
<p>
Here are the changes in the form of a diff:
<div class="code">
-   local streamer=`which streamer`
+   local streamer=`which mplayer`
    if [ -n "$streamer" ]; then
    	# take four pictures every 0.5 seconds as JPEG
-	$streamer -t 4 -r 0.5 -o "$tmpdir/streamer0.jpeg" &> /dev/null
+	$streamer  -tv driver=v4l2:fps=2 -vo jpeg:outdir=$tmpdir -vf framestep=10  -frames 10  tv:// 1>/dev/null 2>&1
-	if [ -f "$tmpdir/streamer3.jpeg" ]; then # we got it
-		mv "$tmpdir/streamer3.jpeg" "$webcam__picture" > /dev/null
-		rm -f "$tmpdir/streamer{0,1,2}.jpeg" 2> /dev/null
+	if [ -f "$tmpdir/00000005.jpg" ]; then # we got it
+		mv "$tmpdir/00000005.jpg" "$webcam__picture" > /dev/null
+		rm -f "$tmpdir"/000000??.jpg 2> /dev/null
 	else # some webcams are unable to take JPGs so we try to grab a PPM
</div>
<p>
The changes are: change the application to mplayer, change the application's command
line options, change the names of the image files to the ones that mplayer generates.
<p>
I am using the 5th jpeg because, on my webcam, it is usually the first one where the
camera has stabilised.  You might experiment with your camera to check.  You probably
don't want to use the last image, number 10, in case the user/thief notices the camera
light come on and covers up the camera.
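<p>
If the camera is covered or the capture is cut short, the preferred frame may not exist
at all.  One way to cope is to fall back to the latest earlier frame that was written.
This is a sketch under that assumption, not part of the patch; <i>pick_frame</i> is a
hypothetical helper name and it relies on the 00000001.jpg style of numbering seen above.

```shell
# Sketch only: pick_frame is a hypothetical helper, not part of Prey.

# Print the path of frame number $2 in directory $1.  If that frame is
# missing, step back to the nearest earlier frame that exists.
# Return 1 if no frame at or below $2 is found.
pick_frame() {
    local dir="$1" n="$2" candidate
    while [ "$n" -ge 1 ]; do
        candidate="$dir/$(printf '%08d.jpg' "$n")"
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
        n=$(( n - 1 ))
    done
    return 1
}
```

With this, <code>pick_frame "$tmpdir" 5</code> gives the 5th jpeg when it exists and an
earlier one when it does not.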
<p>
I've truncated the code listings above to make them easier to read here; as a result the diff
above is not a real one.  You can <a
href="http://www.nickcoleman.org/axs/ax.pl?http://www.nickcoleman.org/dropbox/prey_mplayer.patch">download
the actual patch</a> and apply it.  Cd to prey/modules/webcam/platform/linux, move the patch
file there, and apply it with <code>patch -p1 &lt; prey_mplayer.patch</code>
<p>
I hope you find this useful.  With any luck, you will never need it.
<p>
]]></content:encoded>
<wfw:commentRss>http://www.nickcoleman.org/blog/index.cgi?post=preymplayer%21201109111035%21internet%2Cunix%2Cgeneral/feed/</wfw:commentRss>
</item>
<item>
<title>Australia Debate on Cyber Data Retention
</title>
<link>http://www.nickcoleman.org/blog/index.cgi?post=cyberlaw%21201108301015%21politics%2Cinternet%2Ccensorship%2Claw</link>
<comments>http://www.nickcoleman.org/blog/index.cgi?post=cyberlaw%21201108301015%21politics%2Cinternet%2Ccensorship%2Claw#comments</comments>
<pubDate>Tue, 30 Aug 2011 10:15:00 -0700</pubDate>
<dc:creator>Nick</dc:creator>
<category>politics</category>
<category>internet</category>
<category>censorship</category>
<category>law</category>
<guid isPermaLink="true">http://www.nickcoleman.org/blog/index.cgi?post=cyberlaw%21201108301015%21politics%2Cinternet%2Ccensorship%2Claw/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[
<p>
Australia is "debating" changes to the law that will force ISPs and telecommunication
companies to keep data on their users.  I put "debating" in quotes, because the
government is forcing the bills through without any sort of meaningful public input.
<p>
The bills are a disgrace.  As one commentator (see link below) says, we don't accept
that the government can open our letters and read them, so why should email or text
messages be any different? We also haven't been told whether a history of our day-to-day
browsing the Web will be kept, or for how long, or who would have access to that
information.
<p>
The trouble with broad-reaching legislation is that, despite reassuring comments at
first, it inevitably gets used in the widest possible way, well beyond
the original intent.  For this reason alone, we should be concerned about it.
<p>
In the interests of fairness, I link to <a
href="http://www.nickcoleman.org/axs/ax.pl?http://www.theage.com.au/opinion/politics/cyber-law-casts-the-proper-net-20110829-1jib6.html"><i>The
Age</i>'s opinion piece</a> by Robert McClelland, the Federal Attorney-General, who is
replying to a <a href="http://www.nickcoleman.org/axs/ax.pl?http://www.theage.com.au/technology/technology-news/critics-label-cybercrime-bill-invasion-of-privacy-20110818-1j03s.html">previous critical piece</a>.  Make sure you read the comments on McClelland's piece for several very good
reasons why the legislation should be rejected.
<p>
]]></content:encoded>
<wfw:commentRss>http://www.nickcoleman.org/blog/index.cgi?post=cyberlaw%21201108301015%21politics%2Cinternet%2Ccensorship%2Claw/feed/</wfw:commentRss>
</item>
</channel>
</rss>

