<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>silentYak &#187; Unix</title>
	<atom:link href="http://www.silentyak.com/tag/unix/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.silentyak.com</link>
	<description>...a universal platform for global junk...</description>
	<lastBuildDate>Sun, 09 Oct 2011 07:19:45 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Regular Expressions</title>
		<link>http://www.silentyak.com/2008/10/07/regular-expressions/</link>
		<comments>http://www.silentyak.com/2008/10/07/regular-expressions/#comments</comments>
		<pubDate>Wed, 08 Oct 2008 02:30:46 +0000</pubDate>
		<dc:creator>RRI</dc:creator>
				<category><![CDATA[Daily Rant]]></category>
		<category><![CDATA[awk]]></category>
		<category><![CDATA[grep]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Regex]]></category>
		<category><![CDATA[sed]]></category>
		<category><![CDATA[Unix]]></category>

		<guid isPermaLink="false">http://www.silentyak.com/?p=553</guid>
		<description><![CDATA[Regular expressions (regexes) are one of those concepts that sound innocuous, turn out to be frighteningly complex when you approach them, but aren’t that big a deal when you actually get to know them. The idea behind a regex is quite simple: it is a single concise series of symbols that can be used to [...]]]></description>
			<content:encoded><![CDATA[<p>Regular expressions (regexes) are one of those concepts that sound innocuous, turn out to be frighteningly complex when you approach them, but aren’t that big a deal when you actually get to know them.</p>
<p>The idea behind a regex is quite simple: it is a single concise series of symbols that describes a class of strings <em>exactly</em>. For example, a regex could represent “a sequence of characters that begins with the letter C” or “a sequence beginning with b, ending with d, with any number of x’s in between.” Regexes are extremely expressive, and can come in handy at odd times.</p>
<p>Regular expressions are built on simple rules. The following is not a comprehensive list, but should provide an idea of what regular expressions look like:</p>
<ol>
<li>Letters and digits represent themselves, as do many punctuation characters. Matching is <em>case-sensitive</em>.</li>
<li>A <em>dot</em> “.” represents a single instance of any character.</li>
<li>An <em>asterisk</em> “*” indicates that the preceding character may be repeated zero or more times.</li>
<li>A <em>plus</em> “+” indicates that the preceding character may be repeated one or more times.</li>
<li>A <em>caret</em> “^” is an anchor for the start of the line.</li>
<li>A <em>dollar</em> “$” is an anchor for the end of the line.</li>
<li>The “\&lt;” and “\&gt;” sequences are anchors for the start and end of a word respectively (in GNU grep and sed, among others).</li>
<li>Et cetera.</li>
</ol>
<p>For example, ^Cof*e+$ would match <em>Coffee</em>, <em>Coeeeeee</em> or <em>Coffffffe</em> but not <em>Coffff</em>, <em>coffee</em> or <em>Cofeen</em>. Regexes can be much more complicated in practice, but the basics are sufficient for many common cases.</p>
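<p>To check the example above, here is a minimal sketch that uses Python’s <code>re</code> module as a stand-in for grep (an assumption of this sketch; the post itself does not use Python). The variable names are made up for illustration. The basic operators used here behave the same way in Python as described in the list.</p>

```python
import re

# The pattern from the example: C, o, zero or more f, one or more e,
# anchored to the start (^) and end ($) of the line.
pattern = re.compile(r"^Cof*e+$")

candidates = ["Coffee", "Coeeeeee", "Coffffffe", "Coffff", "coffee", "Cofeen"]

# Map each candidate to True if the whole line matches, False otherwise.
results = {word: bool(pattern.search(word)) for word in candidates}

for word, matched in results.items():
    print(f"{word}: {'match' if matched else 'no match'}")
```

<p>The first three words match; the last three fail because of a missing e, a lowercase c, and a trailing n respectively.</p>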
<p>The most important advantage of understanding regexes is that it opens up the doors to a huge collection of Unix tools, such as <strong><code>grep</code></strong>, <strong><code>sed</code></strong> and <strong><code>awk</code></strong>. Most Unix text editors also support regexes to some degree.</p>
<p>While <strong><code>grep</code></strong>, which finds lines that match a given expression, is the best known of these tools, <strong><code>sed</code></strong> (the ‘stream editor’) is perhaps the most useful, because it can actually manipulate text. For instance, when I migrated more than a hundred old posts into this blog a couple of weeks ago, I needed to replace a whole bunch of <strong><code>&lt;div&gt;</code></strong> tags with <strong><code>&lt;p&gt;</code></strong> tags. That’s where sed came in handy: it took just ten minutes and a single command to get the job done.</p>
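<p>The exact sed command is not reproduced here, so the following is only a rough sketch of the same kind of substitution, written with Python’s <code>re.sub</code> rather than sed. The function name <code>divs_to_paragraphs</code> and the sample text are hypothetical, and the sketch assumes simple, well-behaved tags; it is not a general HTML parser.</p>

```python
import re

def divs_to_paragraphs(html: str) -> str:
    """Rename <div> and </div> tags to <p> and </p>, keeping any attributes."""
    # Group 1 captures an optional leading slash (closing tag);
    # group 2 captures whatever follows the tag name, such as attributes.
    # \b stops the pattern from matching longer tag names that start with "div".
    return re.sub(r"<(/?)div\b([^>]*)>", r"<\1p\2>", html)

# Hypothetical sample input, not taken from the original posts.
sample = '<div>First post</div>\n<div class="old">Second post</div>'
converted = divs_to_paragraphs(sample)
print(converted)
```

<p>A single substitution handles both opening and closing tags, which is the sort of job that would otherwise mean editing every post by hand.</p>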
]]></content:encoded>
			<wfw:commentRss>http://www.silentyak.com/2008/10/07/regular-expressions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

