<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule"
	>
<channel>
	<title>Comments on: A content-based file manager</title>
	<atom:link href="http://www.dmi.me.uk/blog/2009/06/04/a-content-based-file-manager/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.dmi.me.uk/blog/2009/06/04/a-content-based-file-manager/</link>
	<description></description>
	<lastBuildDate>Thu, 22 Dec 2011 18:53:23 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: Ilya</title>
		<link>http://www.dmi.me.uk/blog/2009/06/04/a-content-based-file-manager/comment-page-1/#comment-290</link>
		<dc:creator>Ilya</dc:creator>
		<pubDate>Thu, 27 Aug 2009 16:09:42 +0000</pubDate>
		<guid isPermaLink="false">http://www.dmi.me.uk/blog/?p=124#comment-290</guid>
		<description>Hi Dave,

Interesting discussion this. In my opinion, though, location (whether it is visual or structural) is an attribute of the content/context. I suppose what I mean is that there are plenty of visual search engines out there (just search for these), but where they fail (in my opinion) is that they are not personalized: the visual data is represented based on some generic formula. This lack of personalization makes them complex and frustrating to use, because it takes too long to go through the copious amounts of data/files. Google are coming close with their visual representation of filing their videos (www.youtube.com/warp_speed), but this is only useful for browsing and having a bit of fun. To actually find a video about, say, American Idol, an initial search word describing it would still be necessary.

With memories you place their location somewhere in your mind, and the way you recall them is by thinking: on Tuesday the 5th, at 11.30pm, at my house, on the couch, I read on TV that the stock of IBM went up 5%. This triggers something, because you have a linked memory to stocks that you purchased the week earlier; this links to your mate who told you that you should buy them, this links to your wife, whom you told, but she said it was a bad idea, and the list goes on...

So, back to location being an attribute. For me, memories come up not from where I&#039;ve been (or place) but from other senses (touch, smell [highly sensitive], sounds, tastes, emotions, etc). Of course these are not possible to replicate via digital means as yet - but what&#039;s interesting is having the ability to create links and connections between these attributes to provide meaningful results for search. I suppose what I am getting at is - say a particular piece of music is playing in the background: you automatically remember the location, the smells, who you&#039;re with, the emotions. Having the ability to replicate this in a semantic file system would provide a whole lot more interesting results. So a file located on the system can be retrieved through the context attributes which you saved with it - i.e. meaningful extracted tag words, image recognition, sound attributes, etc. I know this is a little far out, but in my opinion it is the true meaning of semantic file storage. As an example, you search &#039;last time I jammed with John&#039; - all the photos with John in them would come up, the music you have in common, the lyrics or documents you created, the emails/IMs you had discussing the jam sessions, the jam session events and future jam sessions scheduled, the instruments involved, where you bought the instruments.


Now, how do we translate this into a digital format and create a file system? In my opinion, it cannot be linear. So in order to reference a file you have multiple attributes reflecting one file - but to store the file itself, it only needs some uniqueness so it doesn&#039;t get mixed up with another file - so even a GUID (or a simplified version of a GUID) would work.</description>
		<content:encoded><![CDATA[<p>Hi Dave,</p>
<p>Interesting discussion this. In my opinion, though, location (whether it is visual or structural) is an attribute of the content/context. I suppose what I mean is that there are plenty of visual search engines out there (just search for these), but where they fail (in my opinion) is that they are not personalized: the visual data is represented based on some generic formula. This lack of personalization makes them complex and frustrating to use, because it takes too long to go through the copious amounts of data/files. Google are coming close with their visual representation of filing their videos (www.youtube.com/warp_speed), but this is only useful for browsing and having a bit of fun. To actually find a video about, say, American Idol, an initial search word describing it would still be necessary.</p>
<p>With memories you place their location somewhere in your mind, and the way you recall them is by thinking: on Tuesday the 5th, at 11.30pm, at my house, on the couch, I read on TV that the stock of IBM went up 5%. This triggers something, because you have a linked memory to stocks that you purchased the week earlier; this links to your mate who told you that you should buy them, this links to your wife, whom you told, but she said it was a bad idea, and the list goes on&#8230;</p>
<p>So, back to location being an attribute. For me, memories come up not from where I&#8217;ve been (or place) but from other senses (touch, smell [highly sensitive], sounds, tastes, emotions, etc). Of course these are not possible to replicate via digital means as yet &#8211; but what&#8217;s interesting is having the ability to create links and connections between these attributes to provide meaningful results for search. I suppose what I am getting at is &#8211; say a particular piece of music is playing in the background: you automatically remember the location, the smells, who you&#8217;re with, the emotions. Having the ability to replicate this in a semantic file system would provide a whole lot more interesting results. So a file located on the system can be retrieved through the context attributes which you saved with it &#8211; i.e. meaningful extracted tag words, image recognition, sound attributes, etc. I know this is a little far out, but in my opinion it is the true meaning of semantic file storage. As an example, you search &#8216;last time I jammed with John&#8217; &#8211; all the photos with John in them would come up, the music you have in common, the lyrics or documents you created, the emails/IMs you had discussing the jam sessions, the jam session events and future jam sessions scheduled, the instruments involved, where you bought the instruments.</p>
<p>Now, how do we translate this into a digital format and create a file system? In my opinion, it cannot be linear. So in order to reference a file you have multiple attributes reflecting one file &#8211; but to store the file itself, it only needs some uniqueness so it doesn&#8217;t get mixed up with another file &#8211; so even a GUID (or a simplified version of a GUID) would work.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dave</title>
		<link>http://www.dmi.me.uk/blog/2009/06/04/a-content-based-file-manager/comment-page-1/#comment-289</link>
		<dc:creator>Dave</dc:creator>
		<pubDate>Tue, 25 Aug 2009 17:39:01 +0000</pubDate>
		<guid isPermaLink="false">http://www.dmi.me.uk/blog/?p=124#comment-289</guid>
		<description>Hi Ilya,

The problem is in defining uniqueness in a way that&#039;s easy for the user to understand. Internally we can just store a number that uniquely references the file, just as most, if not all, other file systems do.

I think the uniqueness would have to be context-related, as the way you would tell files apart would vary depending on what attributes they share and what you&#039;ve already filtered on. Not all attributes are necessarily useful for uniquely identifying a file, but there needs to be something other than a name in this system.

As far as my own mind goes, I generally identify memories by a few key things: date, location, and people involved (if any). Something that I was doing at the time may also help identify things. This does of course rely on a few assumptions that aren&#039;t necessarily true on a computer system.

A file may have a creation and last modified date associated with it, which would help for uniqueness. Location may help in two ways: location represented by the data, and also a visual/conceptual location of the data. People involved could also be represented in two ways: as owner/collaborator or as people represented by the data itself.

I feel I should explain &quot;visual/conceptual location of the data&quot; a little further. Visual location of the data could be likened to having a preferred layout for the icons on your desktop. After a little while, you know where something is by its position (and then its icon when you get closer), without having to read the labels. An example of a (not necessarily explicit) thought: &quot;Office applications are up in the top right... then that&#039;s the word processor icon...&quot;

There is also conceptual location with traditional file systems, which the language we use reveals. &quot;Oh, that file&#039;s in folder FOO which is in either project XYZ or ABC&quot; could represent a file &quot;within&quot;

&lt;code&gt;/projects/xyz/foo&lt;/code&gt;

or

&lt;code&gt;/projects/abc/foo&lt;/code&gt;

.</description>
		<content:encoded><![CDATA[<p>Hi Ilya,</p>
<p>The problem is in defining uniqueness in a way that&#8217;s easy for the user to understand. Internally we can just store a number that uniquely references the file, just as most, if not all, other file systems do.</p>
<p>I think the uniqueness would have to be context-related, as the way you would tell files apart would vary depending on what attributes they share and what you&#8217;ve already filtered on. Not all attributes are necessarily useful for uniquely identifying a file, but there needs to be something other than a name in this system.</p>
<p>As far as my own mind goes, I generally identify memories by a few key things: date, location, and people involved (if any). Something that I was doing at the time may also help identify things. This does of course rely on a few assumptions that aren&#8217;t necessarily true on a computer system.</p>
<p>A file may have a creation and last modified date associated with it, which would help for uniqueness. Location may help in two ways: location represented by the data, and also a visual/conceptual location of the data. People involved could also be represented in two ways: as owner/collaborator or as people represented by the data itself.</p>
<p>I feel I should explain &#8220;visual/conceptual location of the data&#8221; a little further. Visual location of the data could be likened to having a preferred layout for the icons on your desktop. After a little while, you know where something is by its position (and then its icon when you get closer), without having to read the labels. An example of a (not necessarily explicit) thought: &#8220;Office applications are up in the top right&#8230; then that&#8217;s the word processor icon&#8230;&#8221;</p>
<p>There is also conceptual location with traditional file systems, which the language we use reveals. &#8220;Oh, that file&#8217;s in folder FOO which is in either project XYZ or ABC&#8221; could represent a file &#8220;within&#8221;</p>
<div class="codecolorer-container text twitlight" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">/projects/xyz/foo</div></div>
<p>or</p>
<div class="codecolorer-container text twitlight" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;width:435px;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">/projects/abc/foo</div></div>
<p>.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ilya</title>
		<link>http://www.dmi.me.uk/blog/2009/06/04/a-content-based-file-manager/comment-page-1/#comment-288</link>
		<dc:creator>Ilya</dc:creator>
		<pubDate>Tue, 25 Aug 2009 13:23:58 +0000</pubDate>
		<guid isPermaLink="false">http://www.dmi.me.uk/blog/?p=124#comment-288</guid>
		<description>Hi guys,

I was having a think about this - and wondered whether the file system could mimic in some way how we file our own memories in our minds. This is highly conceptual I know, but if you need uniqueness for a filesystem - why not simply use a unique ID for the user of the file (creator/owner) and the timestamp created, plus maybe the duration/size of the file or a final timestamp [closed file]... I suppose I came up with this because when you recollect a memory, it&#039;s based on context (emotion, content, people, interaction, senses) and the time it happened for you. So why not try to replicate this?

Technically I&#039;m not clear on how this would work exactly, but as I see it, you&#039;re trying to merge the concept of uniqueness with the ability to have context. Uniqueness would be the owner ID &amp; timestamp. 

The problems, as I understand them, arise from the fact that if files are shared (which memories are not entirely, except explicitly), they can be owned by more than one person. This is where you can become creative, either by creating clones of files (with a new owner ID and timestamp) - or at least by creating links/references with the new owner ID and timestamp, while the original context stays the same (with version control of the context for different users). Aha, what happens if the new owner wants to share the file with the originator? Well, that means the new owner would need to provide permission! 

Now we get into types of permissions of files - implicit (owner), explicit (editable), information (read-only), data (tags/references)... this is the basis of knowledge management.

Anyway I hope some of that made sense... :)</description>
		<content:encoded><![CDATA[<p>Hi guys,</p>
<p>I was having a think about this &#8211; and wondered whether the file system could mimic in some way how we file our own memories in our minds. This is highly conceptual I know, but if you need uniqueness for a filesystem &#8211; why not simply use a unique ID for the user of the file (creator/owner) and the timestamp created, plus maybe the duration/size of the file or a final timestamp [closed file]&#8230; I suppose I came up with this because when you recollect a memory, it&#8217;s based on context (emotion, content, people, interaction, senses) and the time it happened for you. So why not try to replicate this?</p>
<p>Technically I&#8217;m not clear on how this would work exactly, but as I see it, you&#8217;re trying to merge the concept of uniqueness with the ability to have context. Uniqueness would be the owner ID &amp; timestamp. </p>
<p>The problems, as I understand them, arise from the fact that if files are shared (which memories are not entirely, except explicitly), they can be owned by more than one person. This is where you can become creative, either by creating clones of files (with a new owner ID and timestamp) &#8211; or at least by creating links/references with the new owner ID and timestamp, while the original context stays the same (with version control of the context for different users). Aha, what happens if the new owner wants to share the file with the originator? Well, that means the new owner would need to provide permission! </p>
<p>Now we get into types of permissions of files &#8211; implicit (owner), explicit (editable), information (read-only), data (tags/references)&#8230; this is the basis of knowledge management.</p>
<p>Anyway I hope some of that made sense&#8230; <img src='http://www.dmi.me.uk/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Dave</title>
		<link>http://www.dmi.me.uk/blog/2009/06/04/a-content-based-file-manager/comment-page-1/#comment-144</link>
		<dc:creator>Dave</dc:creator>
		<pubDate>Fri, 05 Jun 2009 11:14:10 +0000</pubDate>
		<guid isPermaLink="false">http://www.dmi.me.uk/blog/?p=124#comment-144</guid>
		<description>@Babul: Thanks! I&#039;m glad you like it. I&#039;m not 100% happy with this layout, so I&#039;m going to build my own... maybe this weekend. It is a lot less ugly than before.

@David Durant:

One of the other cool things to do would definitely be auto-generated hierarchies, but that either requires a file to have a file-system-unique basename, or for the basename to be autogenerated along with the hierarchy (which would be my preferred solution: context-sensitive file naming).

The internal identifier for a file is the inode, by definition. The inode number is unchanged during the lifetime of the file. If you were to create a file and hardlink it (e.g. &quot;touch file1; ln file1 file2&quot;) then both directory entries would share an inode number (as can be verified by &quot;ls -i&quot;). This makes it suitable and unique. Last access time since Epoch would not be unique, perhaps even down to nanosecond resolution. It&#039;s perfectly possible to access or modify multiple files in under a second (think &quot;grep foo *&quot; or &quot;sed &#039;s/foo/bar/&#039; *&quot;).

My eventual plan is for this to work at the file system layer -- it will tie directly into an Insight file system. If this was separate (i.e. to work for any file system) then I would probably store it using the same custom structure I use for Insight, which is optimised for this use case in a way that SQL/relational databases are not.</description>
		<content:encoded><![CDATA[<p>@Babul: Thanks! I&#8217;m glad you like it. I&#8217;m not 100% happy with this layout, so I&#8217;m going to build my own&#8230; maybe this weekend. It is a lot less ugly than before.</p>
<p>@David Durant:</p>
<p>One of the other cool things to do would definitely be auto-generated hierarchies, but that either requires a file to have a file-system-unique basename, or for the basename to be autogenerated along with the hierarchy (which would be my preferred solution: context-sensitive file naming).</p>
<p>The internal identifier for a file is the inode, by definition. The inode number is unchanged during the lifetime of the file. If you were to create a file and hardlink it (e.g. &#8220;touch file1; ln file1 file2&#8221;) then both directory entries would share an inode number (as can be verified by &#8220;ls -i&#8221;). This makes it suitable and unique. Last access time since Epoch would not be unique, perhaps even down to nanosecond resolution. It&#8217;s perfectly possible to access or modify multiple files in under a second (think &#8220;grep foo *&#8221; or &#8220;sed &#8216;s/foo/bar/&#8217; *&#8221;).</p>
<p>My eventual plan is for this to work at the file system layer &#8212; it will tie directly into an Insight file system. If this was separate (i.e. to work for any file system) then I would probably store it using the same custom structure I use for Insight, which is optimised for this use case in a way that SQL/relational databases are not.</p>
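<p>For anyone who wants to try the hardlink point without a shell, here is a minimal Python sketch, assuming a POSIX-style file system that supports hard links. The helper names same_file and hardlink_demo are made up for illustration and have nothing to do with Insight itself.</p>

```python
import os
import tempfile


def same_file(path_a, path_b):
    """Return True if both paths resolve to the same inode on the same device."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def hardlink_demo():
    """Rough equivalent of `touch file1; ln file1 file2` plus an unrelated file."""
    with tempfile.TemporaryDirectory() as d:
        file1 = os.path.join(d, "file1")
        file2 = os.path.join(d, "file2")
        other = os.path.join(d, "other")
        open(file1, "w").close()   # touch file1
        os.link(file1, file2)      # ln file1 file2 (hard link)
        open(other, "w").close()   # a separate file with its own inode
        return (
            same_file(file1, file2),   # two names, one inode
            same_file(file1, other),   # different files, different inodes
            os.stat(file1).st_nlink,   # link count after hardlinking
        )
```

<p>The device number is compared along with the inode number because inode numbers are only unique within a single file system.</p>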
]]></content:encoded>
	</item>
	<item>
		<title>By: Babul</title>
		<link>http://www.dmi.me.uk/blog/2009/06/04/a-content-based-file-manager/comment-page-1/#comment-143</link>
		<dc:creator>Babul</dc:creator>
		<pubDate>Fri, 05 Jun 2009 00:43:16 +0000</pubDate>
		<guid isPermaLink="false">http://www.dmi.me.uk/blog/?p=124#comment-143</guid>
		<description>Very interesting article. Also liking the new blog layout, much nicer than before.</description>
		<content:encoded><![CDATA[<p>Very interesting article. Also liking the new blog layout, much nicer than before.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: David Durant</title>
		<link>http://www.dmi.me.uk/blog/2009/06/04/a-content-based-file-manager/comment-page-1/#comment-142</link>
		<dc:creator>David Durant</dc:creator>
		<pubDate>Thu, 04 Jun 2009 23:55:43 +0000</pubDate>
		<guid isPermaLink="false">http://www.dmi.me.uk/blog/?p=124#comment-142</guid>
		<description>I&#039;m almost certainly too old as I still find the idea of visual clouds representing data cluttering and less useful than a hierarchy (although a hierarchy generated on the fly via metadata). No reason not to offer both I suppose. :-)

I agree that if you have a need for the filesystem to have an internal identifier for the file when access is provided by metadata then content is as good a way as any to generate that (although realistically you could just as easily use pretty much anything - for example the last-accessed time in seconds since the Epoch). I would suggest perhaps a hash rather than an inode as I&#039;m sure some file systems would change the inode for a file in certain circumstances. However, hashing a large file (or a number of large files) is, of course, potentially computationally expensive.

I&#039;d be interested in your thoughts on making this whole thing work at a low layer. It&#039;s my (possibly incorrect) assumption that this is still a (SQLite?) database-based system sitting on top of an existing filesystem? I wonder if there is a way to eliminate the whole file system layer and work down at that level...?</description>
		<content:encoded><![CDATA[<p>I&#8217;m almost certainly too old as I still find the idea of visual clouds representing data cluttering and less useful than a hierarchy (although a hierarchy generated on the fly via metadata). No reason not to offer both I suppose. <img src='http://www.dmi.me.uk/blog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>I agree that if you have a need for the filesystem to have an internal identifier for the file when access is provided by metadata then content is as good a way as any to generate that (although realistically you could just as easily use pretty much anything &#8211; for example the last-accessed time in seconds since the Epoch). I would suggest perhaps a hash rather than an inode as I&#8217;m sure some file systems would change the inode for a file in certain circumstances. However, hashing a large file (or a number of large files) is, of course, potentially computationally expensive.</p>
<p>I&#8217;d be interested in your thoughts on making this whole thing work at a low layer. It&#8217;s my (possibly incorrect) assumption that this is still a (SQLite?) database-based system sitting on top of an existing filesystem? I wonder if there is a way to eliminate the whole file system layer and work down at that level&#8230;?</p>
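<p>To make the hashing idea a bit more concrete, a rough Python sketch follows. The name content_id is hypothetical and SHA-256 is just an arbitrary choice for illustration; nothing in this discussion specifies either. It reads the data in chunks so memory use stays flat, though the cost still grows with file size, which is the expense mentioned above.</p>

```python
import hashlib
import io


def content_id(stream, chunk_size=65536):
    """Return a hex digest identifying the bytes read from a binary stream.

    Reading in fixed-size chunks avoids loading a large file into memory,
    but every byte still has to be read and hashed once.
    """
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


# Identical content gives the same identifier; different content does not.
first = content_id(io.BytesIO(b"jam session notes"))
second = content_id(io.BytesIO(b"jam session notes"))
third = content_id(io.BytesIO(b"jam session notes, v2"))
```

<p>Note the flip side: an identifier derived from content changes whenever the file is edited, whereas an inode number does not.</p>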
]]></content:encoded>
	</item>
</channel>
</rss>

