<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Jorge Arango &#187; GNU/Linux</title>
	<atom:link href="http://www.jarango.com/en/blog/category/gnulinux/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.jarango.com/en</link>
	<description>Information Architecture + User Experience Design</description>
	<lastBuildDate>Thu, 10 Mar 2011 20:11:57 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>How to run a word count on a website using free Unix tools</title>
		<link>http://www.jarango.com/en/blog/2006/08/08/how-to-run-a-word-count-on-a-website-using-free-unix-tools/</link>
		<comments>http://www.jarango.com/en/blog/2006/08/08/how-to-run-a-word-count-on-a-website-using-free-unix-tools/#comments</comments>
		<pubDate>Wed, 09 Aug 2006 06:33:38 +0000</pubDate>
		<dc:creator>jarango</dc:creator>
				<category><![CDATA[Globalization]]></category>
		<category><![CDATA[GNU/Linux]]></category>
		<category><![CDATA[Web Development]]></category>

		<guid isPermaLink="false">http://www.jarango.com/blog/?p=506</guid>
		<description><![CDATA[I love working on a Mac. My PowerBook is not the fastest computer in the world, but it works reliably and is virus- and malware-free. (Thus far.) And when you&#8217;re working on text-heavy document sets (such as websites), OS X&#8217;s Unix tools can be incredible time savers. An example: running a word count on a [...]]]></description>
			<content:encoded><![CDATA[<p>I love working on a Mac. My PowerBook is not the fastest computer in the world, but it works reliably and is virus- and malware-free. (Thus far.) And when you&#8217;re working on text-heavy document sets (such as websites), OS X&#8217;s Unix tools can be incredible time savers.</p>
<p>An example: running a word count on a published site. This is a request I get fairly frequently; translators usually want to know how much work it will take to translate a site from one language to another (e.g., Spanish to English). Fortunately, two Unix tools make this work very easy: lynx and wc.<br />
The following is the sequence of commands I usually employ:</p>
<p>Open up the terminal and type the following:</p>
<p><code><br />
cd ~/Desktop<br />
mkdir sitename_com<br />
cd sitename_com<br />
</code></p>
<p>This creates a new folder called sitename_com on your Desktop, and then places you in it. Now type:</p>
<p><code><br />
lynx -traversal -crawl http://www.sitename.com<br />
</code></p>
<p><strong>lynx</strong> is an amazing command-line web browser that does many things. Here we&#8217;re using it with the -traversal switch, which follows every link it finds in the site you pointed it to (http://www.sitename.com). The -crawl switch saves each page it finds as a text file with a .dat extension, <em>without</em> the HTML markup. Just what we want!</p>
<p><em>Note:</em> if lynx isn&#8217;t on your system, you can install it using <a href="http://fink.sourceforge.net/">Fink</a>. Explaining how to do this is beyond the scope of this post; check out the <a href="http://fink.sourceforge.net/doc/index.php?phpLang=en">documentation</a> on the Fink site for more info.</p>
<p>Next step:</p>
<p><code><br />
wc -w *.dat &gt; ~/Desktop/wordcount.txt<br />
</code></p>
<p><strong>wc</strong> is a word count utility. Here we are telling it to count only words (hence the -w switch) in all files with the .dat extension (in other words, the files that lynx saved in the current directory in the previous step). The results are saved to a file called wordcount.txt on your desktop: one count per file and, when more than one file matches, a grand total on the last line. Open this file up in a text editor, and you&#8217;re done!</p>
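<p>If you only need the grand total, a minimal sketch along these lines prints a single number. The two sample files and their names are made up for illustration (stand-ins for whatever lynx saved); in practice you would run only the <code>cat</code> line from inside the sitename_com folder.</p>

```shell
# Hypothetical stand-ins for the .dat files lynx saved (names and contents are made up).
workdir=$(mktemp -d)
printf 'Home About Contact\nWelcome to our site\n' > "$workdir/page1.dat"
printf 'Home About Contact\nOur products are great\n' > "$workdir/page2.dat"

# Concatenate every .dat file and count the words once, so the output is a
# single number rather than one line per file. tr strips the padding that
# some versions of wc put in front of the count.
total=$(cat "$workdir"/*.dat | wc -w | tr -d ' ')
echo "$total"
```

<p>Piping through <code>cat</code> is just one way to do it; reading the last line of wordcount.txt gives the same figure whenever more than one file was counted.</p>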
<p>Well, not quite. Pages in the same site usually have many words in common. For example, navigation menus are usually the same throughout the site. It wouldn&#8217;t be fair to count the navigation labels as &#8220;new words&#8221;, because they will only need to be translated once. I usually take a look at a few of the .dat files that lynx created, to guesstimate a percentage of repeated words. (It can be anywhere from 10% to 40% or more of the site content.) I then subtract that share from the total. (I always make it clear that the number I&#8217;m giving is at best a rough estimate. But this is better than nothing!)</p>
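<p>The subtraction itself is simple arithmetic; here is a rough sketch with made-up numbers. Both the raw total and the 25% figure are assumptions for illustration, not measurements from a real site.</p>

```shell
# Hypothetical inputs: replace with your own wc total and your own guesstimate.
total=12000       # raw word count reported by wc
repeated_pct=25   # estimated share of repeated words (navigation, footers, etc.)

# Integer arithmetic: keep the share of words that is not repeated.
estimate=$(( total * (100 - repeated_pct) / 100 ))
echo "$estimate"
```

<p>Shell arithmetic works on whole numbers only, which is plenty of precision for an estimate this rough.</p>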
<p>Of course, none of these tools are Mac-specific; the same steps work on Linux and even Windows (using <a href="http://www.cygwin.com/">Cygwin</a>).</p>
<p>If you have any Unix web-dev tips to share, or if you know of ways of improving this technique, please let me know.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jarango.com/en/blog/2006/08/08/how-to-run-a-word-count-on-a-website-using-free-unix-tools/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SuSE 8.2</title>
		<link>http://www.jarango.com/en/blog/2003/05/19/suse-82/</link>
		<comments>http://www.jarango.com/en/blog/2003/05/19/suse-82/#comments</comments>
		<pubDate>Mon, 19 May 2003 22:18:55 +0000</pubDate>
		<dc:creator>jarango</dc:creator>
				<category><![CDATA[GNU/Linux]]></category>

		<guid isPermaLink="false">http://www.jarango.com/blog/?p=263</guid>
		<description><![CDATA[SuSE is my favorite Linux distribution. Its installer is very easy to use, and the included application selection is comprehensive and useful. The most recent version, SuSE 8.2, has been very well received, and is widely acknowledged as the most advanced commercial distribution at the moment (more so than Red Hat, the current mindshare champ).]]></description>
			<content:encoded><![CDATA[<p>SuSE is my favorite Linux distribution. Its installer is very easy to use, and the included application selection is comprehensive and useful. The most recent version, <a href="http://www.suse.com/us/private/products/suse_linux/i386/index.html">SuSE 8.2</a>, has been very well received, and is widely acknowledged as the most advanced commercial distribution at the moment (more so than Red Hat, the current mindshare champ).</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jarango.com/en/blog/2003/05/19/suse-82/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Revisiting KDE</title>
		<link>http://www.jarango.com/en/blog/2003/01/13/revisiting-kde/</link>
		<comments>http://www.jarango.com/en/blog/2003/01/13/revisiting-kde/#comments</comments>
		<pubDate>Mon, 13 Jan 2003 23:35:00 +0000</pubDate>
		<dc:creator>jarango</dc:creator>
				<category><![CDATA[GNU/Linux]]></category>
		<category><![CDATA[Random Notes]]></category>

		<guid isPermaLink="false">http://www.jarango.com/blog/?p=151</guid>
		<description><![CDATA[I&#8217;m sending in my PowerBook to have the FireWire port fixed today, so I&#8217;ve gone back to using Windows (office) and Linux (home) for a while. It is an interesting experiment; using KDE again after being on OS X for six months gives me a new appreciation of both systems. Obviously, KDE seems unpolished when compared [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m sending in my PowerBook to have the FireWire port fixed today, so I&#8217;ve gone back to using Windows (office) and Linux (home) for a while. It is an interesting experiment; using KDE again after being on OS X for six months gives me a new appreciation of both systems. Obviously, KDE seems unpolished when compared to Aqua, but it&#8217;s still a very cool desktop. I could probably live full-time in KDE if there were a better Office replacement for Linux. (Maybe one from Microsoft itself?)</p>
<p>I&#8217;ve also been toying around with the new Evolution beta, and it is slick! I&#8217;d really like to see Ximian&#8217;s Exchange connector at work with this; it seems like it would do about 75% of what Outlook does (which seems to be what most enterprise users need anyway). Still, I don&#8217;t think it is as good as Entourage in OS X. (I really miss AppleScript!)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jarango.com/en/blog/2003/01/13/revisiting-kde/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Red Hat 8.0</title>
		<link>http://www.jarango.com/en/blog/2002/09/30/red-hat-80/</link>
		<comments>http://www.jarango.com/en/blog/2002/09/30/red-hat-80/#comments</comments>
		<pubDate>Tue, 01 Oct 2002 03:45:06 +0000</pubDate>
		<dc:creator>jarango</dc:creator>
				<category><![CDATA[GNU/Linux]]></category>

		<guid isPermaLink="false">http://www.jarango.com/blog/?p=83</guid>
		<description><![CDATA[Red Hat 8.0 is out. CNet has an article that focuses on the controversy over the changes to GNOME and KDE that are being pushed by RH in this release. Here&#8217;s another interesting review, courtesy of /.]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.redhat.com/about/presscenter/2002/press_eightoh.html">Red Hat 8.0 is out</a>. CNet has <a href="http://news.com.com/2100-1001-960015.html?tag=fd_lede">an article</a> that focuses on the controversy over the changes to GNOME and KDE that are being pushed by RH in this release. Here&#8217;s another <a href="http://osnews.com/story.php?news_id=1842">interesting review</a>, courtesy of /.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jarango.com/en/blog/2002/09/30/red-hat-80/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Bruce Perens Fired</title>
		<link>http://www.jarango.com/en/blog/2002/09/09/bruce-perens-fired/</link>
		<comments>http://www.jarango.com/en/blog/2002/09/09/bruce-perens-fired/#comments</comments>
		<pubDate>Tue, 10 Sep 2002 01:52:26 +0000</pubDate>
		<dc:creator>jarango</dc:creator>
				<category><![CDATA[GNU/Linux]]></category>

		<guid isPermaLink="false">http://www.jarango.com/blog/?p=74</guid>
		<description><![CDATA[Just got back from a short weekend holiday in Boston. Reading the NY Times on the plane, I found an article [registration required] stating that Bruce Perens has been fired from HP due to his support for open-source software and constant criticism of Microsoft. The HP-Compaq merger has made Microsoft a much more important relationship [...]]]></description>
			<content:encoded><![CDATA[<p>Just got back from a short weekend holiday in Boston. Reading the NY Times on the plane, I found an <a href="http://www.nytimes.com/2002/09/09/technology/09SOFT.html">article</a> [registration required] stating that Bruce Perens has been fired from HP due to his support for open-source software and constant criticism of Microsoft. The HP-Compaq merger has made Microsoft a much more important relationship for HP; I can&#8217;t help but wonder if MSFT asked for Mr. Perens&#8217; head. Also noteworthy is the fact that Carly Fiorina has been publicly supportive of GNU/Linux, which makes this seem like a major step back for HP in this regard.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.jarango.com/en/blog/2002/09/09/bruce-perens-fired/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

