<?xml version="1.0" encoding="utf-8"?>
<!-- generator="Kukkaisvoima version 9" -->
<rss version="2.0"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
>
<channel>
<atom:link href="http://www.nickcoleman.org/blog/index.cgi/feed" rel="self" />
<title>Nick Coleman: image</title>
<link>http://www.nickcoleman.org/blog/index.cgi</link>
<description>Nick Coleman blog</description>
<pubDate>Tue, 21 Jun 2011 08:46:00 -0700</pubDate>
<lastBuildDate>Tue, 21 Jun 2011 08:46:00 -0700</lastBuildDate>
<generator>http://23.fi/kukkaisvoima/</generator>
<language>en</language>
<item>
<title>A Script to Download Webcam Images
</title>
<link>http://www.nickcoleman.org/blog/index.cgi?post=getwebcam%21201106210846%21programming%2Cunix%2Cimage</link>
<comments>http://www.nickcoleman.org/blog/index.cgi?post=getwebcam%21201106210846%21programming%2Cunix%2Cimage#comments</comments>
<pubDate>Tue, 21 Jun 2011 08:46:00 -0700</pubDate>
<dc:creator>Nick</dc:creator>
<category>programming</category>
<category>unix</category>
<category>image</category>
<guid isPermaLink="true">http://www.nickcoleman.org/blog/index.cgi?post=getwebcam%21201106210846%21programming%2Cunix%2Cimage/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[
<p>
<div class="summary">Using a script to pull webcam images; the script uses a file with a
list of URLs, the time to pull the images, and any conversions needed when the URL is
time or date specific.</div>
<p>
I <a
href="http://www.nickcoleman.org/blog/index.cgi/openbsdinstall%21201104220940%21unix">wrote
previously</a> about setting up an iMac G4 to display photos and images.  The images are
from webcams around the world, pulled every few hours and displayed automatically.  Now
I go into some details about how I do that. (You can download the script at <a
href="#download">the bottom of this page</a>.)
<p>
What factors do you need to take into account when getting a webcam image?  You need a
URL.  You might decide you want to get the image only during daylight, so you need some
method of telling the script whether to download the image on this run or not.  Some
webcam URLs have a date or time embedded in them, so you need some way of generating
that URL or of picking the latest image.  You might want to display a caption
with the image. Finally, you want to be able to specify your own name for the saved
image, since names collide otherwise (e.g. many websites use "webcam.jpg").
<p>
The easiest way to deal with all the alternatives was to separate the script from the
data. I can then add and delete webcam URLs from the data without affecting the script.
Plus, I can develop the script without affecting the list of webcams.
<p>
<h2>Data</h2>
Here is the format for the data file:<br/>
<code> URL | Caption | DATEADJ flag | UPDATETIME | NAME</code>
<p>
The UPDATETIME could have a format like "07-13", which would mean retrieve only
between 7am and 1pm.  Or it could have "06-09,17-19", which means retrieve only between
6am and 9am or between 5pm and 7pm (i.e. sunrise and sunset).
<p>
The DATEADJ flag is an indicator for the script to take special action on the URL.  For
example, a webcam's URL might be "http://www.nickcoleman.org/webcam/my-image-110621111520" (which could indicate an
image taken on 2011-06-21 11:15:20).  The problem is that we don't know the URL's time
component in advance; often it is not consistent, so the next image, assuming an image
every 15 minutes, might be at 11:30:45, not 11:30:20. This makes it impossible to predict
the URL, so we need a special way to handle such URLs.
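<p>
As a rough illustration, a data file in this format might look like the sketch below.
Every entry is hypothetical: the example.com URLs, the captions, the names and, in
particular, the use of 0 and 1 for the DATEADJ flag are assumptions made for the example.

```shell
# Print a made-up data file, one webcam per line:
#   URL|Caption|DATEADJ flag|UPDATETIME|NAME
sample_webcam_file() {
    cat <<'EOF'
http://www.example.com/harbour/webcam.jpg|Harbour, looking east|0|06-09,17-19|harbour.jpg
http://www.example.com/summit/index.html|Mountain summit|1|07-13|summit.jpg
http://www.example.com/aurora/webcam.jpg|Aurora camera|0|22-05|aurora.jpg
EOF
}

sample_webcam_file
```

The second entry points at a web page rather than an image, which is the kind of entry
the DATEADJ flag is meant for; the third uses a band that crosses midnight.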
<p>
<h2>Script</h2>
The script's workflow would be something like this:
<ul style="margin-left:0px; padding-left:2em"><li>read the data file line by line, and for each line:</li>
<ul style="margin-left:0px; padding-left:2em"><li>check UPDATETIME, and if the current time is within that band:</li>
<ul style="margin-left:0px; padding-left:2em"><li>check the DATEADJ flag, and if set:</li>
<ul style="margin-left:0px; padding-left:2em"><li>manipulate the URL to get the current, latest image URL</li></ul>
<li>pull down the image, renaming it to NAME</li>
<li>add the caption</li>
</ul>
</ul>
</ul>
<p>
<h3>Reading the data file</h3>
<p>
Reading the data file is done in a simple while loop:
<div class="code">
while IFS=\| read URL COMMENT DATEADJ UPDATETIME NAME
do
 ...
done < webcam_file
</div>
where webcam_file is the data file. Setting IFS (the Internal Field Separator) to
"|" as part of the <code>while</code> loop's <code>read</code> command confines the
change to that command, so other parts of the script, which may expect IFS to be
the normal whitespace, are not affected; it is the recommended method.
<p>
Sidenote:  the while loop isn't testing the IFS, it's testing the 'read' command.
Recall that a command in the shell takes the form <code>[var1=... [var2=... [varn=...]]]
command</code>, where <code>varX</code> are optional environment variables. Here, the IFS is
an optional environment variable and is being set to '|', and then the <code>read</code>
command is being executed.
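<p>
To see the per-command IFS at work on a single record, here is a minimal sketch.  The
record and the <code>parse_record</code> function name are made up for the example; the
variable names follow the data format above (with CAPTION for the second field).

```shell
# Hypothetical record; the field values are invented for illustration.
record='http://www.example.com/harbour/webcam.jpg|Harbour, looking east|0|06-09,17-19|harbour.jpg'

parse_record() {
    # IFS is "|" only for this one 'read' command; the shell's own
    # IFS is left untouched once 'read' has finished.
    IFS='|' read -r URL CAPTION DATEADJ UPDATETIME NAME <<EOF
$1
EOF
}

parse_record "$record"
echo "name=$NAME times=$UPDATETIME caption=$CAPTION"
```

Note that the caption keeps its embedded spaces and comma, because only "|" separates
fields while <code>read</code> runs.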
<p>
<h3>Handling retrieval times</h3>
<p>
I wanted to keep this simple.  Remember that UPDATETIME
can be "08-12" or "08-12,16-20" or similar.  I decided that two bands were enough for my
purposes, so:
<div class="code">
band1=$(echo "$UPDATETIME" | cut -d , -f 1)
band2=$(echo "$UPDATETIME" | cut -d , -f 2)
 # band3, band4, etc
 # band3=$(echo "$UPDATETIME" | cut -d , -f 3)
 # When the value has no comma at all, 'cut' passes the whole line
 # through unchanged, so band2 comes back as a copy of band1; force it empty.
    if [ "$band1" = "$band2" ] ; then
        band2=""
    fi
</div>
The script tests and corrects for the singleton case, such as "08-12".  A line that
contains no delimiter is passed through by 'cut' unchanged (unless the -s flag is
given), so asking for field 2 returns the whole value again rather than the empty
string, and band2 must be reset by hand.
<p>
It handles two bands, which I've found sufficient, but more can be added easily enough.
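<p>
An alternative sketch, not what the script does, is to split the bands with the shell's
own parameter expansion so that no external command is involved.  The function name
<code>split_bands</code> is made up for the example.

```shell
split_bands() {
    # $1 is an UPDATETIME value such as "08-12" or "08-12,16-20".
    band1=${1%%,*}                 # everything before the first comma
    case $1 in
        *,*)
            band2=${1#*,}          # drop band1 and its comma
            band2=${band2%%,*}     # keep only the second band
            ;;
        *)
            band2=""               # no comma, so no second band
            ;;
    esac
}

split_bands "06-09,17-19"
echo "band1=$band1 band2=$band2"
```

With a singleton such as "08-12", band2 is simply set to the empty string.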
<p>
Next, test whether the current hour is within one of the bands.  I use this
snippet within a function: 
<div class="code">
 # TRUE and FALSE are defined elsewhere (viz. `false`;FALSE=$?)
 # To avoid complicated if...then clauses, if a true is found, 
 # return immediately, otherwise keep processing. 
for time in $band1 $band2 ; do
    # reached an empty time without finding a TRUE, so exit;
    if [ -z "$time" ] ; then
	break;
    fi
    # start = the prefix (e.g. of 12-24, 12)
    # end =  the suffix (e.g. of 12-24, 24)
    start=${time%%-*}
    end=${time##*-}
<p>
    # Handle case where start -> end crosses midnight
    # e.g. 22 -> 5: success with hour at 23 or 4; fail with hour at 21 or 6
    if [ $start -ge $end ] ; then
	if [ $start -le $HOUR -o $HOUR -le $end ] ; then
	    return $TRUE
	fi
    # handle normal case where start -> end is within one day
    else
	if [ $start -le $HOUR -a $HOUR -le $end ] ; then
	    return $TRUE
	fi
    fi
done
return $FALSE
</div>
This gets the start and end of each band and tests whether the current HOUR is
within one of those bands, taking into account the transition over midnight
where hour changes from 23 to 0.
<p>
It might look like the function is open-ended on its returns, but bear in mind that we
are looking for any instance of the HOUR being within the bands.  As soon as one is
found, the function returns TRUE immediately, with no need to examine the other cases.
If no band matches, it returns FALSE.
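<p>
The band test for a single band can be tried out on its own with a sketch like the
following.  It is a simplified stand-in rather than the script's function: the name
<code>in_band</code> is made up, and the hour is passed as an argument instead of being
read from HOUR.

```shell
in_band() {
    # $1 = hour to test (0-23), $2 = band such as "07-13" or "22-5".
    hour=$1
    start=${2%%-*}
    end=${2##*-}
    if [ "$start" -ge "$end" ]; then
        # Band crosses midnight: late in the evening OR early in the morning.
        [ "$hour" -ge "$start" ] || [ "$hour" -le "$end" ]
    else
        # Normal band: after the start AND before the end.
        [ "$hour" -ge "$start" ] && [ "$hour" -le "$end" ]
    fi
}

for h in 21 23 4 6; do
    if in_band "$h" "22-5"; then
        echo "$h is inside 22-5"
    else
        echo "$h is outside 22-5"
    fi
done
```

The function's exit status is that of the last test it ran, so it can be used directly
in an <code>if</code>.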
<p>
<h3>Manipulating URLs</h3>
<p>
I had a hard time figuring how to deal with URLs with an embedded datetime.  The problem
is that you don't know the URL before you go to retrieve it, so you have to somehow
figure out what the actual URL is beforehand.
<p>
After sweating on this for a few hours and trying a bunch of date and time arithmetic
solutions that didn't work across all webcams, I found the easiest solution was to pull
the web page itself (rather than the image directly) and then parse the page's source to
find the image's URL.  What we are looking for is something associated with the image,
but which doesn't change when the image's URL changes.  This could be a nearby &lt;div&gt;
tag, or perhaps an id or class name for the &lt;img&gt; tag.
<p>
This approach saves having to deal with any awkward date and time arithmetic, or any
knowledge of how often the image is updated.  
<p>
It actually works very well and shouldn't be considered half-baked.  It is versatile,
plus you will often find a simple cut-n-paste approach works when you go to add a new
webcam.  For example, many times the webcam's URL will be contained within a &lt;div&gt;
which can be searched for.  Here's a common one:
<div class="code">
...
&lt;div class="main"&gt;
&lt;img src="http://www.somewhere.com/webcams/20110622_010053_rp.jpg" /&gt;
&lt;/div&gt;
...
</div>
You can get the image's URL easily with a simple awk script:<br />
<code>URL=$(awk 'BEGIN { FS="\"" } /div class="main/ {getline; print $2;} ' website.html)</code><br />
The field separator is set to the double quote (").  Awk searches the source for 'div class="main',
reads the next line and prints its second field, which is the value of the src attribute.
<p>
Another common one is where the image tag has an id or name associated with it:<br />
<code>URL=$(awk 'BEGIN { FS="\"" } /img name="anim/ {  print $4;  } ' website.html)</code><br />
where the image tag has a name starting with "anim" and the URL is the fourth field on that line.
<p>
You can then pull the actual image, rename it and continue.
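<p>
Here is a self-contained sketch of the first awk technique, run against a small sample
page held in a variable so that nothing needs to be downloaded.  The function name
<code>extract_main_img</code> and the sample page are made up for the example; a real
page will need its own search pattern.

```shell
extract_main_img() {
    # Split lines on the double quote; on the line after the
    # div class="main" marker, field 2 is the value of src.
    awk 'BEGIN { FS = "\"" } /div class="main/ { getline; print $2; exit }'
}

# Hypothetical page source, in the same shape as the example above.
sample_page='<html><body>
<div class="main">
<img src="http://www.somewhere.com/webcams/20110622_010053_rp.jpg" />
</div>
</body></html>'

sample_url=$(printf '%s\n' "$sample_page" | extract_main_img)
echo "$sample_url"
```

The <code>exit</code> stops awk after the first match, which avoids picking up a second
block with the same class further down the page.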
<p>
<h3>Pulling the image</h3>
<p>
I use wget as it easily allows the output to be renamed with the -O flag.  I found that
specifying --prefer-family=IPv4 reduces DNS resolving delays, and that --timeout=300
(seconds, i.e. 5 minutes) with --tries=3 (the default is 20) lets the script skip over
non-responding servers without a long delay. [Addendum: In the script, I actually set the three
available timeouts (dns, connect, read) to different values to speed it up.]
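<p>
As a sketch of what such a call could look like with the three timeouts set separately:
the timeout values below are placeholders chosen for the example, not the ones in the
script, and the <code>fetch_image</code> name and the WGET override are made up.

```shell
fetch_image() {
    # $1 = image URL, $2 = local file name to save it as.
    # Setting WGET=echo prints the command line instead of running wget.
    "${WGET:-wget}" --quiet --prefer-family=IPv4 \
        --dns-timeout=10 --connect-timeout=20 --read-timeout=60 \
        --tries=3 -O "$2" "$1"
}

# Dry run: show the command line that would be used.
WGET=echo
fetch_image "http://www.example.com/harbour/webcam.jpg" "harbour.jpg"
```

Unset WGET (or set it to wget) to do the actual download.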
<p>
<h3>Handling errors</h3>
<p>
You will often find a webcam is having network problems or that the URL has changed, so
we need some method of reporting errors.  Here are a couple of tricks I used to make
error reporting easier.
<p>
The first is to set up an array variable where each index is the error code and each
value is the error message.  This works in the following way: if the error code returned
is 1, print out the array[1], which you have previously set up to be the error message
for that error code.  Here is wget's:
<div class="code">
set -A WGETERRCODE   "No problems occurred." \
    "Generic error code."  \
    "Parse error, e.g. command-line option" \
    "File I/O error." \
    "Network failure." \
    "SSL verification failure." \
    "Username/password authentication failure." \
    "Protocol errors." \
    "Server issued an error response."
</div>
Wget is easy because its error codes are sequential, starting at 0, so they map directly
onto the array indices; you may have to fiddle with the indices if the error codes for
your application are not sequential.
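<p>
When the codes are not sequential, or the shell has no arrays, one alternative is a
<code>case</code> statement.  This is a sketch of that idea, not what the script uses;
the function name <code>wget_errmsg</code> is made up, and the messages are the same as
in the array above.

```shell
wget_errmsg() {
    # $1 = wget exit status; prints the matching message.
    case $1 in
        0) echo "No problems occurred." ;;
        1) echo "Generic error code." ;;
        2) echo "Parse error, e.g. command-line option" ;;
        3) echo "File I/O error." ;;
        4) echo "Network failure." ;;
        5) echo "SSL verification failure." ;;
        6) echo "Username/password authentication failure." ;;
        7) echo "Protocol errors." ;;
        8) echo "Server issued an error response." ;;
        *) echo "Unknown exit status $1." ;;
    esac
}

wget_errmsg 4
```

The catch-all branch also covers any exit status that falls outside the list.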
<p>
The second is to concatenate individual error messages into one variable and then echo
that variable at the end of the script.  Here is a snippet using wget: 
<p>
<div class="code"> 
wget  --quiet --tries=3 ...etc 
returncode=$?  
if [ $returncode -ne 0 ] ; then 
    # Append to whatever errors have been collected so far.
    WGETERRORS="$WGETERRORS\nWget failed with $URL, error code is $returncode: ${WGETERRCODE[$returncode]}" 
fi </div> 
<p>
If wget returns a non-zero error code, then append a message with the error code and the
error message to the variable WGETERRORS.  (The shell has some upper limit for the
number of characters in a variable, but it almost certainly is at least a few hundred KB;
presumably you could hit that limit, but in that case you have bigger problems such as
the network being down.)
<p>
At the end of the run, the script does a simple <code>echo $WGETERRORS</code> and you can see all
the URLs that wget could not fetch. Easy. 
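<p>
Whether <code>echo</code> expands a "\n" inside the variable depends on the shell, so a
variation is to store real newlines and print with <code>printf</code>.  This is a
sketch under that assumption; the helper names <code>add_error</code> and
<code>report_errors</code> are made up.

```shell
WGETERRORS=""

add_error() {
    # $1 = URL, $2 = exit status, $3 = error message.
    # The closing quote on the next line stores a real newline.
    WGETERRORS="${WGETERRORS}Wget failed with $1, error code is $2: $3
"
}

report_errors() {
    # Print the collected messages, if any, exactly as stored.
    if [ -n "$WGETERRORS" ]; then
        printf '%s' "$WGETERRORS"
    fi
}
```

Call <code>add_error "$URL" "$returncode" "the message"</code> after each failed fetch
and <code>report_errors</code> once at the end of the run.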
<p>
<a id="download"></a>
<h3>Download</h3>
<p>
I've put up a snapshot of the script that you can <a
href="http://www.nickcoleman.org/downloads/wget-images">download here</a>.  If you want to
keep up to date with any developments, I keep a copy on <a
href="https://github.com/ncoleman/webcam">github here</a>; grab the <code>wget-images</code> file, as the
others are dummy files that are only relevant on my local system.
<p>
Just note there are some lines at the end of the script that are specific to my system,
namely running feh automatically to display the images.  As well, you will see it is
written for ksh.  It will probably translate directly to bash except for the one array
variable, <code>set -A WGETERRCODE ...</code>. Use <code>declare -a WGETERRCODE=("..."
"..." "..." etc)</code> instead (<code>man bash</code> and refer to the section on
Arrays). 
<p>
<h3>Conclusion</h3>
<p>
This is a long post because it covers a lot of ground.  I hope you get some use out of
the ideas here.  I value emails or comments on anything here.
<p>
<p>
]]></content:encoded>
<wfw:commentRss>http://www.nickcoleman.org/blog/index.cgi?post=getwebcam%21201106210846%21programming%2Cunix%2Cimage/feed/</wfw:commentRss>
</item>
<item>
<title>NASA's Mars Spirit Stops
</title>
<link>http://www.nickcoleman.org/blog/index.cgi?post=nasaspiritrestingplace%21201001051737%21nasa%2Cscience%2Cimage</link>
<comments>http://www.nickcoleman.org/blog/index.cgi?post=nasaspiritrestingplace%21201001051737%21nasa%2Cscience%2Cimage#comments</comments>
<pubDate>Tue, 05 Jan 2010 17:37:00 -0700</pubDate>
<dc:creator>Nick</dc:creator>
<category>nasa</category>
<category>science</category>
<category>image</category>
<guid isPermaLink="true">http://www.nickcoleman.org/blog/index.cgi?post=nasaspiritrestingplace%21201001051737%21nasa%2Cscience%2Cimage/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[
<p>
You may know that NASA's Mars Exploration rover <i>Spirit</i> has been bogged in one
place since about April 2009.  It broke through the crust into soft sand, got bogged and
then one wheel stopped working.
<p>
<div class="image"><a href="http://www.nickcoleman.org/blog/images/nasa_rover.jpg" target="_blank">
<img src="http://www.nickcoleman.org/blog/images/nasa_rover_th.jpg" alt="NASA Mars Rover Spirit" ></a>
Copyright NASA
<br />Click for larger image in new window
</div>
<p>
From what I read, they're not hopeful of freeing it. The issue now is that dust is
accumulating on the solar panels and, with winter approaching, they are not confident
that the batteries will recharge.  Which means the end of it.
<p>
It's too early to call for closing-drinks, but it's getting near.  
<p>
Interesting to see a plastic shopping bag, and a well-used track heading off to the local
shops and a pub, no doubt. That's where we'll have the last drinks.   
<p>
It's done a great job.  Its design brief was for an extendable 90
days, and it has lasted for six years.  The engineers should be and no doubt are proud of
what they've made.
<p>
Let's remember that this was one of NASA's cheapest missions and one of its most
productive.  Surely robot missions are the future for exploratory work.  
<p>
This image is NASA's Image of the Day for 4 Jan 10 and is
quite poignant.
<p>
<b>Image Caption</b>
<p>
Click on the image for a larger version.  You can see the caption text that I wrote about
in a <a
href="http://www.nickcoleman.org/axs/ax.pl?http://www.nickcoleman.org/blog/index.cgi/nasaiod!200912210830!unix,nasa,image">previous post</a>.
]]></content:encoded>
<wfw:commentRss>http://www.nickcoleman.org/blog/index.cgi?post=nasaspiritrestingplace%21201001051737%21nasa%2Cscience%2Cimage/feed/</wfw:commentRss>
</item>
<item>
<title>Adding a caption to NASA's Image of the Day
</title>
<link>http://www.nickcoleman.org/blog/index.cgi?post=nasaiod%21200912210830%21unix%2Cnasa%2Cimage</link>
<comments>http://www.nickcoleman.org/blog/index.cgi?post=nasaiod%21200912210830%21unix%2Cnasa%2Cimage#comments</comments>
<pubDate>Mon, 21 Dec 2009 08:30:00 -0700</pubDate>
<dc:creator>Nick</dc:creator>
<category>unix</category>
<category>nasa</category>
<category>image</category>
<guid isPermaLink="true">http://www.nickcoleman.org/blog/index.cgi?post=nasaiod%21200912210830%21unix%2Cnasa%2Cimage/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[
<p>
I recently <a
href="http://www.nickcoleman.org/axs/ax.pl?http://www.nickcoleman.org/blog/index.cgi/textimage!200912200832!unix,image">wrote</a>
about converting text to an image, and gave a brief example using image
magick. To recap, here's the command-line:
<div class="code">convert -fill white -background '#007a7a' -gravity "West" -size 300x50 \
            caption:"address@domain.toplevel" email.jpg</div>
<p>
One very useful way of using this text-to-image feature is to add a caption to
an image.  We will go through this and use NASA's Image of the Day as an
example.  (In actuality, image of the working week give or take a day.  No image on the
weekend, I'm afraid.)
<p>
If you're not familiar with it, NASA's Image of the Day is a high-resolution photo
released by NASA's PR department, usually chosen for its stunning imagery or
its relevance to a science mission.  It's been going for years and is a bit of
an institution among the astronomy and science set.
<p>
The image used to be available on its own page, along with some explanatory text and photo
credits below.  Having its own page meant that you could store a simple
bookmark, which would never change, to view it every day.
<p>
Before long, people were automatically downloading the image using wget
or an ftp script and putting it on their desktop.  A nice new photo every day.
<p>
Nowadays, with the curse of multimedia everywhere, there is no consistent link.
You have to visit NASA's multimedia gallery and scroll through the list of
images to see the latest.  Each image has its own URI, and there is no link
to that entity previously known as "the image of the day".
<p>
However, NASA does provide an RSS feed that contains a link to the latest
image and a caption.  That's great, we can use the contents of the feed both to pull down the image and
to overlay the caption.
<p>
One of image magick's very useful features is its ability to work on several
layers at once.  We will use that feature to create a caption on one layer using
the same technique as in my previous post, and we add the image itself to
another layer.  Then flatten the layers, resize the image to suit our desktop,
and save it.  Presto, we have a nice new desktop image every day.
<p>
Here is a sample of bash code (easily alterable to any command-line, including
DOS):
<p>
<div class="code">
 # Get the rss file that contains the direct link to the image.
 # It is always lg_image_of_the_day.rss.  lg_ means large, there are
 # other sizes too: see the NASA website section on feeds.
wget -N http://www.nasa.gov/rss/lg_image_of_the_day.rss
<p>
 # Parse it to get the actual image URI, and use tail to get the
 # last line.
 # Search for the text "url" and grab what follows.
 # (The sed expression must stay on a single line.)
image_url=$(sed '{s/^.*image\/jpeg \" url=\"\(.*\)\"\/> <\/item><\/channel>.*/\1/;}' lg_image_of_the_day.rss | tail -n 1)
<p>
 # Get the image itself.
wget -N "$image_url"
<br />... various techniques to calculate resizing, both for the desktop
    and for the aspect ratio.
    I'll post them if people are interested...<br />
 # Parse the rss to extract the caption (in XML tag &lt;description&gt;) and
 # save it. This is searching for the tag and getting everything after, up
 # to some arbitrary text limit, here the photo credit.  I had to set a
 # limit as sometimes the text runs for several hundred characters.
 # An alternative method would be just to grab a certain number
 # of characters.
sed 's%.*<description>\(.*\)\(Photo\|Image\) Credit.*%\1%' \
lg_image_of_the_day.rss | tail -n 1 > NasaIod.txt
<p>
 # And finally convert the text to an image, overlay it on the photo image
 # and save it.  Saving it to a non-layer format flattens the 
 # layers automatically.
convert \
    \( -fill white -background '#0008'  -gravity NorthWest -size \
                $width caption:"@NasaIod.txt" \) \
    \(  $image $resize \) \
    +swap -gravity South -composite image.jpg
</div>
The first set of parentheses creates one layer and the second set a
second layer.
<p>
The interesting part in the last statement is the
<code>caption:"@NasaIod.txt"</code>.  This reads a file containing text, which
we created previously when we parsed the rss.
<p>
I'll quickly go through the options to 'convert', which is part of the image
magick suite of programs.
<ul style="margin-left:0px; padding-left:2em">
<li>We put the text into the
first layer rather than the second, then '+swap' them so the text is on the
top layer.  There is a reason for this, but I can't remember it (so shoot me).
I seem to recall that, if the text layer is placed on top of the photo layer,
it resizes to fit, making it too large.</li>
<li>'-gravity' places the smaller text layer at the bottom (South) of the
larger photo layer.</li>
<li>'-composite' merges the two layers.</li>
</ul>
<p>
I haven't gone into too much detail on the rest of the script, assuming you understand sed and some
bash.  I'll explain it further if anyone requests it.
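<p>
For anyone who wants to experiment with the sed step without fetching the feed, here is
a small sketch run against a one-line sample.  The sample item, its
<code>enclosure</code> element and the <code>enclosure_url</code> function name are
assumptions made for the example; the real feed's layout may differ, so the pattern
will need adjusting to match it.

```shell
enclosure_url() {
    # Print the url attribute of an <enclosure ...> element found on
    # each input line; lines without one print nothing.
    sed -n 's/.*<enclosure[^>]* url="\([^"]*\)".*/\1/p'
}

# Hypothetical feed item, for illustration only.
sample_feed='<item><title>Example</title><enclosure type="image/jpeg" url="http://www.example.com/images/example_full.jpg"/></item>'

image_url=$(printf '%s\n' "$sample_feed" | enclosure_url | tail -n 1)
echo "$image_url"
```

Using <code>-n</code> with the <code>p</code> flag means that only lines where the
substitution succeeded are printed, so the result is never an unrelated line of XML.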
<p>
I have put this entire process into a script that completely automates it. It can
handle tall narrow images, adjusting the caption width to take 
aspect-ratio into account.  I run it with cron every morning and have a nice
new desktop image most days.
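<p>
The aspect-ratio arithmetic can be sketched with plain shell integer maths.  This is
only an illustration of the idea, not the calculation my script uses; the function name
<code>fit_size</code> and the screen size in the example are made up.

```shell
fit_size() {
    # $1,$2 = image width,height; $3,$4 = screen width,height.
    # Prints the largest WIDTHxHEIGHT that fits the screen while
    # keeping the image's aspect ratio (integer arithmetic).
    if [ $(( $1 * $4 )) -gt $(( $2 * $3 )) ]; then
        # Image is relatively wider than the screen: width is the limit.
        echo "${3}x$(( $2 * $3 / $1 ))"
    else
        # Image is relatively taller (or the same shape): height is the limit.
        echo "$(( $1 * $4 / $2 ))x${4}"
    fi
}

fit_size 1000 3000 1920 1080    # a tall, narrow image
fit_size 3000 1000 1920 1080    # a wide image
```

The width part of the result is the figure you would want for the caption, so that the
caption is no wider than the scaled image.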
<p>
And, the NASA images are simply stunning!
]]></content:encoded>
<wfw:commentRss>http://www.nickcoleman.org/blog/index.cgi?post=nasaiod%21200912210830%21unix%2Cnasa%2Cimage/feed/</wfw:commentRss>
</item>
<item>
<title>Making an Image of Text
</title>
<link>http://www.nickcoleman.org/blog/index.cgi?post=textimage%21200912200832%21unix%2Cimage</link>
<comments>http://www.nickcoleman.org/blog/index.cgi?post=textimage%21200912200832%21unix%2Cimage#comments</comments>
<pubDate>Sun, 20 Dec 2009 08:32:00 -0700</pubDate>
<dc:creator>Nick</dc:creator>
<category>unix</category>
<category>image</category>
<guid isPermaLink="true">http://www.nickcoleman.org/blog/index.cgi?post=textimage%21200912200832%21unix%2Cimage/</guid>
<description><![CDATA[ 
 [...]]]></description>
<content:encoded><![CDATA[
<p>
Sometimes you need to make an image of text.  Recently, I needed to show my
email address in a place where I really didn't want to expose the text.
<p>
You can try to obfuscate it using well-known techniques such as "email at
address dot org", but this would be in a place where it could sit
for years, and I didn't want to expose it to increasingly sophisticated
spam harvesters.
<p>
I decided that it was worth the small inconvenience to users to show the text
as an image.  As you probably know, the point of using an image is that there
is no text for a machine to harvest, yet a human can interpret it easily.  You
can see an example <a href="http://www.nickcoleman.org/blog/static/contact.html"> here</a> (use your back button to
return).
<p>
It is very easy to do, using <a href="http://www.nickcoleman.org/axs/ax.pl?http://www.imagemagick.org" title="Image
Magick link">Image Magick</a>.  Image magick is one of those amazing
tools that becomes more and more powerful as you use it and realise what it
can do.
<p>
For those interested, a text image like that is simplicity itself:
<pre>convert -fill white -background '#007a7a' -gravity "West" -size 300x50 \
        caption:"  address@domain.toplevel  " email.jpg</pre>
<p>
The extra spaces around the address are purely for formatting, to provide a
nice border around the text.  Image magick can do it too, but I haven't yet
discovered how.
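<p>
One candidate for this, as far as I can tell, is the <code>-border</code> option
together with <code>-bordercolor</code>.  The sketch below is untested against the
contact image itself: the 10x5 border size, the <code>make_text_image</code> name and
the CONVERT override are all assumptions made for the example.

```shell
make_text_image() {
    # $1 = text to render, $2 = output file.
    # Setting CONVERT=echo prints the command line instead of running it.
    "${CONVERT:-convert}" -fill white -background '#007a7a' -gravity West \
        -size 300x50 caption:"$1" \
        -bordercolor '#007a7a' -border 10x5 "$2"
}

# Dry run: show the command that would be executed.
CONVERT=echo
make_text_image "address@domain.toplevel" email.jpg
```

The border is added around the rendered caption, so the finished image comes out larger
than the 300x50 given to <code>-size</code>.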
<p>
<!-- I had to use <code> within <pre>; <code> does line-wrapping, whereas
<pre> by itself made the line extend into the sidebar.  <code> by itself
doesn't appear in a nicely formatted box.  NB [later], this doesn't work.-->
<p>
I picked up the technique while going through some image magick tutorials,
especially those on adding captions to existing images.  I will be posting on
that soon.
<p>
]]></content:encoded>
<wfw:commentRss>http://www.nickcoleman.org/blog/index.cgi?post=textimage%21200912200832%21unix%2Cimage/feed/</wfw:commentRss>
</item>
</channel>
</rss>

