<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule"
>

<channel>
	<title>A mind less ordinary &#187; Insight (semantic filesystem)</title>
	<atom:link href="http://www.dmi.me.uk/blog/tag/insight-semantic-filesystem/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dmi.me.uk/blog</link>
	<description></description>
	<lastBuildDate>Wed, 18 Aug 2010 13:17:48 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
<creativeCommons:license>http://creativecommons.org/licenses/by-nc-nd/3.0/</creativeCommons:license>
		<item>
		<title>A content-based file manager</title>
		<link>http://www.dmi.me.uk/blog/2009/06/04/a-content-based-file-manager/</link>
		<comments>http://www.dmi.me.uk/blog/2009/06/04/a-content-based-file-manager/#comments</comments>
		<pubDate>Thu, 04 Jun 2009 22:12:38 +0000</pubDate>
		<dc:creator>Dave</dc:creator>
				<category><![CDATA[Coding]]></category>
		<category><![CDATA[Insight (semantic filesystem)]]></category>
		<category><![CDATA[Projects]]></category>
		<category><![CDATA[Braindump]]></category>
		<category><![CDATA[file manager]]></category>
		<category><![CDATA[ideas]]></category>

		<guid isPermaLink="false">http://www.dmi.me.uk/blog/?p=124</guid>
		<description><![CDATA[<p>I&#8217;ve been thinking recently about Insight again, and I&#8217;ve been considering part of the problem with naming and uniqueness.</p>
<p>Names in a traditional file system are made unique based on a full path to the file, but most people think of a file name as just the final component. This would then cause a problem with the <a href="http://www.dmi.me.uk/blog/2009/06/04/a-content-based-file-manager/">[...]</a>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been thinking recently about Insight again, and I&#8217;ve been considering part of the problem with naming and uniqueness.</p>
<p>Names in a traditional file system are made unique based on a full path to the file, but most people think of a file name as just the final component. This would then cause a problem with the move to Insight, as a file could appear in multiple directories, and its only distinguishing feature would be the final component of its path. This is counter-intuitive and can cause all sorts of problems.</p>
<p>Consider makefiles, for example. They rely on a standard named file (<tt>Makefile</tt>) appearing at various levels in the hierarchy in order to work. Obviously, you would want different makefiles at different levels and in different projects, but Insight as it stands has no way to handle this.</p>
<p>I then started thinking about what makes a file unique. In the end, I came up with two things: name and content. This covers the makefile case (same name, different content) as well as the backup case (same content, different name). It then occurred to me that, in the general case, all you need to distinguish a file is its content, and then actually finding it can all be left up to metadata.</p>
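<p>As a rough illustration of the two cases, here is a minimal sketch that derives an identifier from content alone. Using a SHA-256 digest is purely an assumption for the example, and <tt>content_id</tt> is a hypothetical helper, not something Insight provides.</p>

```python
import hashlib


def content_id(data: bytes) -> str:
    """Hypothetical identifier: the SHA-256 hex digest of the file's bytes."""
    return hashlib.sha256(data).hexdigest()


# Makefile case: same name, different content, so the identifiers differ.
makefile_a = b"all:\n\tcc -o app main.c\n"
makefile_b = b"all:\n\tcc -o lib lib.c\n"

# Backup case: same content, different name, so the identifiers are equal
# and only metadata (such as the name) tells the two copies apart.
report = b"quarterly figures"
report_backup = bytes(report)

same_name_differs = content_id(makefile_a) != content_id(makefile_b)
backup_matches = content_id(report) == content_id(report_backup)
```

<p>In this sketch the name plays no part in identity at all; it is just one more piece of metadata used for finding the file.</p>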
<p><span id="more-124"></span>If files are thought of as containers for data that happen to have a unique internal identifier (which never needs to be exposed to the user, although it could be accessed as, say, the file&#8217;s inode number), then the idea of a content-based file manager comes into play. The examples that follow work best with visual media, particularly images, but there is no reason in principle why the approach could not be extended to other kinds of file.</p>
<p>Imagine searching for a file. You know it&#8217;s a photo, but you have a large collection of them. With a digital camera and a large-capacity memory card, who ever needs to delete a photo? We&#8217;ll assume that you&#8217;ve diligently tagged the photos with metadata as you&#8217;ve imported them from the camera, through some easy batch process.</p>
<p>On the tagging point: a lot can be taken from the metadata stored by the camera (date/time, resolution, orientation, black and white/colour, perhaps GPS co-ordinates) and with the right tools, more can be inferred (auto-tagging faces, buildings, perhaps recognising common events like football matches, converting GPS co-ordinates to places, &#8230;). As time goes on, people will need to do less and less manual tagging.</p>
<p>Anyway, back to the file manager. You know you are after a picture or a set of pictures. Normal thought processes will probably follow a path similar to: &#8220;Yeah, I wanted to show dad those <strong>photos</strong> from that <strong>holiday</strong> in <strong>Paris</strong> that we had <strong>two months ago</strong>. I think he&#8217;d particularly like the ones we got of the <strong>Louvre</strong>, as well as the ones <strong>with me in</strong>, of course.&#8221; I&#8217;ve highlighted various key words that can be translated directly to metadata searches. Notice how these all involve a narrowing down of the query.</p>
<p>To convert these to filters, we then have:</p>
<ul>
<li>type: <strong>photo</strong></li>
<li><strong>holiday</strong></li>
<li>location: <strong>Paris</strong></li>
<li>date: <strong>two months ago</strong></li>
<li>at least one of:
<ul>
<li>location: <strong>Louvre</strong></li>
<li>person: <strong>Me</strong></li>
</ul>
</li>
</ul>
<p>This could also be represented by a query:</p>
<blockquote>
<div class="codecolorer-container text twitlight" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">type:photo AND holiday AND location:Paris AND date:-2m<br />
AND (location:Louvre OR person:me)</div></div>
</blockquote>
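<p>The same query can be sketched as a predicate over per-photo metadata. This is only an illustration: the <tt>Photo</tt> record and its fields are hypothetical, and <tt>date:-2m</tt> is assumed to mean &#8220;taken within roughly the last two months&#8221;.</p>

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Photo:
    """Hypothetical metadata record for one imported file."""
    name: str
    kind: str
    taken: date
    tags: set = field(default_factory=set)
    locations: set = field(default_factory=set)
    people: set = field(default_factory=set)


def matches(photo: Photo, today: date) -> bool:
    """type:photo AND holiday AND location:Paris AND date:-2m
    AND (location:Louvre OR person:me)"""
    # Assumption: "-2m" is approximated as the last 62 days.
    recent = today - timedelta(days=62) <= photo.taken <= today
    return (
        photo.kind == "photo"
        and "holiday" in photo.tags
        and "Paris" in photo.locations
        and recent
        and ("Louvre" in photo.locations or "me" in photo.people)
    )


today = date(2009, 6, 4)
photos = [
    Photo("louvre.jpg", "photo", date(2009, 4, 10), {"holiday"}, {"Paris", "Louvre"}),
    Photo("selfie.jpg", "photo", date(2009, 4, 11), {"holiday"}, {"Paris"}, {"me"}),
    Photo("cafe.jpg", "photo", date(2009, 4, 12), {"holiday"}, {"Paris"}),
    Photo("old.jpg", "photo", date(2008, 1, 1), {"holiday"}, {"Paris", "Louvre"}),
]

# Each filter narrows the result; only the Louvre shot and the selfie survive.
hits = [p.name for p in photos if matches(p, today)]
```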
<p>Breaking it down in this way feels fairly technical and wordy, however. I&#8217;d much prefer a visual view.</p>
<p>Imagine a black field, speckled with points of light representing your photos:</p>
<p><img class="aligncenter size-full wp-image-131" title="Content File Manager 1" src="http://www.dmi.me.uk/blog/wp-content/uploads/2009/06/content-file-manager-1.png" alt="Content File Manager 1" width="450" height="340" /></p>
<p>You filter by &#8220;holiday&#8221;, and (because the file manager learns from previous searches) it then groups the results by location. The photos that have been filtered out fade into nothing, and the remaining ones group into labelled blobs and enlarge slightly:</p>
<p><img class="aligncenter size-full wp-image-132" title="Content File Manager 2" src="http://www.dmi.me.uk/blog/wp-content/uploads/2009/06/content-file-manager-2.png" alt="Content File Manager 2" width="450" height="340" /></p>
<p>You filter by date, and as you drag the slider, irrelevant items fade away and relevant ones enlarge:</p>
<p><img class="aligncenter size-full wp-image-133" title="Content File Manager 3" src="http://www.dmi.me.uk/blog/wp-content/uploads/2009/06/content-file-manager-3.png" alt="Content File Manager 3" width="450" height="340" /></p>
<p>Then you add the final filters and set the photos up for viewing, perhaps as a slideshow&#8230; and you&#8217;re done!</p>
<p>Pretty neat, I think.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dmi.me.uk/blog/2009/06/04/a-content-based-file-manager/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
	<creativeCommons:license>http://creativecommons.org/licenses/by-nc-nd/3.0/</creativeCommons:license>
	</item>
		<item>
		<title>Insight: Where am I now, and where next?</title>
		<link>http://www.dmi.me.uk/blog/2008/06/05/insight-where-am-i-now-and-where-next/</link>
		<comments>http://www.dmi.me.uk/blog/2008/06/05/insight-where-am-i-now-and-where-next/#comments</comments>
		<pubDate>Thu, 05 Jun 2008 23:18:29 +0000</pubDate>
		<dc:creator>Dave</dc:creator>
				<category><![CDATA[Insight (semantic filesystem)]]></category>
		<category><![CDATA[Personal]]></category>
		<category><![CDATA[University]]></category>

		<guid isPermaLink="false">http://www.dmi.me.uk/blog/?p=55</guid>
		<description><![CDATA[<p>So I&#8217;ve been in Deep Coding Mode™ for quite a while. What have I got to show for it?</p>
<p>Well, the short answer is that Insight is now a functioning file system&#8230; for a given definition of &#8220;functioning&#8221;.</p>
<p>As of this morning:</p>

It can successfully import files from the rest of the hierarchy
Tags (i.e. directories) can be created and <a href="http://www.dmi.me.uk/blog/2008/06/05/insight-where-am-i-now-and-where-next/">[...]</a>]]></description>
			<content:encoded><![CDATA[<p>So I&#8217;ve been in Deep Coding Mode™ for quite a while. What have I got to show for it?</p>
<p>Well, the short answer is that <strong>Insight</strong> is now a functioning file system&#8230; for a given definition of &#8220;functioning&#8221;.</p>
<p><span id="more-55"></span></p>
<p>As of this morning:</p>
<ul>
<li>It can successfully import files from the rest of the hierarchy</li>
<li>Tags (i.e. directories) can be created and removed at any level.</li>
<li>Tags appear and disappear more or less as you expect (i.e. if you already have a tag in your path, it won&#8217;t show up in listings again).</li>
<li>Tags that are synonyms will show up as symbolic links to the actual target tag (but currently do not obey the rule in the point above, i.e. if their target has been used in the path, they may still appear).</li>
</ul>
<p>So far, so good. Now for the limitations:</p>
<ul>
<li>Files cannot currently be opened, read from, written to, or deleted.</li>
<li>Files must be imported in a strange manner: as absolute symbolic links. They then show up as regular files, although they are actually just links to the originals elsewhere in the filesystem.</li>
<li>Tags cannot be assigned to files (or removed from them)</li>
<li>Files can therefore only be imported at the root level</li>
<li>Queries have no effect on file listing, and so listings just show files in limbo</li>
<li>Of course, there is no subcategory union either.</li>
</ul>
<p>But I am working on all of these things. At the moment, the main thing is sorting out the internal inode lists. Once those are done, then it should be quite straightforward to do tag assignment/removal and import directly into tags. Plan of action, therefore:</p>
<ol>
<li>Implement inode insertion/deletion</li>
<li>Implement inode set functions (intersection, union, difference)</li>
<li>Re-implement query tree builder from path. Currently only deals with building a basic conjunctive query tree and assumes that all components are tags. Should:
<ul>
<li>Take a path</li>
<li>Canonicalise it</li>
<li>Check path components (left-to-right) to ensure tags exist</li>
<li>If last part is an incomplete tag, treat appropriately</li>
<li>If last part is a complete tag, then fine</li>
<li>If last part does not resolve as a tag, then hash it and see if it translates to a known inode</li>
<li>If not, or if any tags in path do not exist, then path is invalid</li>
<li>If it is a valid inode, then add QUERY_IS_INODE node to tree</li>
<li>Otherwise return query tree</li>
</ul>
</li>
<li>Implement query processing:
<ul>
<li>Given input set of inodes, produce an output set at each node of the query tree.</li>
<li>In trivial case with top-level <tt>IS_ANY</tt> node, output set is the set of limbo inodes, with internal negation flag set to false</li>
<li>With an <tt>IS</tt> node, the output set is the recursive union of the inodes belonging to that tag and its subtags, with internal negation flag set to false</li>
<li>With an <tt>IS_NOSUB</tt> node, the output set is the set of inodes belonging to that tag alone, with internal negation flag set to false</li>
<li>With an <tt>IS_INODE</tt> node, the output set contains a single element: the inode.</li>
<li>With an <tt>IS_NOT</tt> node with a subquery, the output set is identical to the subquery result set, with the internal negation flag inverted</li>
<li>With an <tt>IS_NOT</tt> node with a tag, the output set is the same as for an <tt>IS</tt> node, with the internal negation flag set to true</li>
<li>An <tt>AND</tt> node output depends on the negation flags of its subqueries:
<ul>
<li>Both false: output is the set intersection of its subqueries, with negation flag clear</li>
<li>Both true: output is union of subqueries, with negation flag set</li>
<li>Otherwise: output is set difference, with the negation-true set removed from the negation-false set, and the negation flag cleared</li>
</ul>
</li>
<li>An <tt>OR</tt> node output depends on the negation flags of its subqueries:
<ul>
<li>Both false: output is union of subquery results, with negation flag clear</li>
<li>Both true: output is intersection of subquery results, with negation flag set</li>
<li>Otherwise: output is <strong><span style="color: #ff0000;">???</span></strong></li>
</ul>
</li>
<li>It is very likely an error if the negation flag is still set at the top level.</li>
<li>Also have to think about how to build a tree from a bracketed expression. But later. Much later.</li>
</ul>
</li>
<li>Output of query processing is an inode set.</li>
<li>Maybe low-overhead query processing just to see if an inode would match the query?</li>
<li>Implement open/read/write as pass-through operations on the inode symlink targets.</li>
<li>Implement symlinking directories as creating synonyms.</li>
<li>Add <strong>LOTS</strong> of checks.</li>
<li>Note: also have to track inode reference count, so that when it gets to zero the inode is added to the limbo list. Once removed from there, it is removed from the filesystem completely.</li>
</ol>
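<p>The negation-flag rules from the query processing step can be sketched as a small recursive evaluator. This is a sketch under assumptions, not Insight&#8217;s actual code: the node classes are hypothetical, tags are assumed to be already resolved to inode sets, and the <tt>IS_ANY</tt>, <tt>IS_NOSUB</tt> and <tt>IS_INODE</tt> leaves are omitted because they would only supply different leaf sets. The mixed <tt>OR</tt> case, which the list leaves open, is filled in here by applying De Morgan&#8217;s law (A OR NOT B equals NOT (B minus A)); treat that as a suggestion rather than a settled design.</p>

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Is:
    inodes: FrozenSet[int]  # inodes of the tag and its subtags, pre-resolved


@dataclass(frozen=True)
class Not:
    sub: "Query"


@dataclass(frozen=True)
class And:
    left: "Query"
    right: "Query"


@dataclass(frozen=True)
class Or:
    left: "Query"
    right: "Query"


Query = Union[Is, Not, And, Or]
Result = Tuple[FrozenSet[int], bool]  # (inode set, negation flag)


def evaluate(q: Query) -> Result:
    """Return an inode set plus a flag meaning 'everything except this set'."""
    if isinstance(q, Is):
        return q.inodes, False
    if isinstance(q, Not):
        inodes, negated = evaluate(q.sub)
        return inodes, not negated  # same set, flag inverted
    (a, na), (b, nb) = evaluate(q.left), evaluate(q.right)
    if isinstance(q, And):
        if not na and not nb:
            return a & b, False
        if na and nb:
            return a | b, True
        plain, excluded = (a, b) if nb else (b, a)
        return plain - excluded, False
    # Or node
    if not na and not nb:
        return a | b, False
    if na and nb:
        return a & b, True
    # Assumed rule for the mixed case: A OR NOT B == NOT (B - A).
    plain, excluded = (a, b) if nb else (b, a)
    return excluded - plain, True
```

<p>A result whose flag is still set at the top level describes &#8220;every inode except these&#8221;, so it can only be turned into a listing if the full set of inodes is known.</p>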
<p>The items on this plan should be quite straightforward to do (I hope), especially as I know more or less exactly what I&#8217;m doing. Deadlines are closing in, however, and I still have a report, a presentation and a demo to write. Hopefully I can get much of this done by Tuesday, and can then spend the day doing bits of my report.</p>
<p>I must say that I do love developing this. It&#8217;s just so amazing to be developing a file system and see it work!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.dmi.me.uk/blog/2008/06/05/insight-where-am-i-now-and-where-next/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	<creativeCommons:license>http://creativecommons.org/licenses/by-nc-nd/3.0/</creativeCommons:license>
	</item>
	</channel>
</rss>
