Apple Businesses

The Mac, Metadata, and the World 307

Rick Zeman writes: "ArsTechnica has posted yet another compelling article, this time on metadata, its history and the future of metadata storage as seemingly indicated by Apple in OS X. Extensions==Bad!"
This discussion has been archived. No new comments can be posted.

  • Metadata: A PITA for all real users!
  • The author correlates changing the extension of a filename with changing the file size. I often run across files on some Macintosh machines that I maintain. If I didn't have the ability to change the file type I wouldn't be able to view or edit the data. Being able to readily change the viewer/filetype lets people read things the computer doesn't know how to handle.
    • No, this isn't what the article is stating... The article said that "File Type" can be changed to reflect *accuracy*, i.e., if your file *is* a JPEG, but for some reason the metadata associated with the file tags it as a TIFF, you can change the metadata to the more accurate JPEG without affecting the actual encoding of the file itself.

  • Linux? (Score:4, Interesting)

    by interiot ( 50685 ) on Tuesday August 21, 2001 @05:52PM (#2202385) Homepage
    The glaring message I got from this was: Windows implements file type metadata quite badly.

    And the glaring question was: why is Linux blindly following Windows? Linux's file type handling is still at a somewhat early stage; it wouldn't be inconceivable for the paradigm to change.

    • Re:Linux? (Score:2, Interesting)

      by David Roundy ( 34889 )
      Because Linux is based on Unix, and has blindly followed Unix, not Windows. Also, Linux supports a vast number of file systems, which means that the metadata would have to either be stored on all those file systems, or the OS would have to be able to live without it, which would probably lead to it being ignored by most of the software.

      Unless metadata is implemented consistently, its use can do more harm than good. ("I copied this jpeg (named 'picture') to my windows partition and back again, and now I can't view it!")
    • by alexhmit01 ( 104757 ) on Tuesday August 21, 2001 @06:13PM (#2202471)
      The UNIX file-system is brilliant compared to DOS, but ONLY compared to DOS. It is still designed for command-line users' convenience. I am NOT criticizing the command line; I use it daily under OpenBSD, Linux, WinNT4, Win2K, and Mac OS X. It is nice to have the control of a CLI, as well as the ability to run scripts.

      HOWEVER, the system of making things conveniently obvious for the CLI results in engineering decisions that give the OS less flexibility. GUIs can provide TREMENDOUS amounts of information BECAUSE the user decides when to get that information.

      For example, the filename and type need easy access for the user. For a GUI user, they need the filename and the type deciding the application binding. For a CLI user, including the type with the filename makes it easier to manipulate.

      While you could set up ls (or dir) with many flags to pick and choose the information, you create a minor mess. Additionally, changing the type by picking from a list backed by a database is simple in a GUI with a dropdown box, but a nightmare to implement in a CLI. If you designed for the CLI, you made a tradeoff.

      Additionally, UNIX was developed in a hardware environment more restricted than the DOS world. Early machines used in development are nothing compared to modern machines.

      Take the NTFS file system. If you are on an NT4 machine, or a Win2K machine, (running NTFS of course, not braindead FAT/FAT32) you see filenames as normal. Inside the properties, there are MANY more options. Do it on a Win2K machine, and you see more information than on an NT4 machine if you look closely.

      The UNIX approach is old and dated. Microsoft has moved on, it's important for the UNIX community to do so as well. ACLs (implemented on NT) are FAR more flexible than users/groups. Private user groups are an ugly hack to handle the user/group system. The whole UNIX model needs to be modernized. There are ACL UNIX systems, but they aren't the mainstream.

      I love the power of UNIX-based servers; they give me tremendous capabilities. A proper CLI is awesome. But let's not kid ourselves. Beating Win95/Win98/WinME at ANYTHING was never impressive; they were ugly hacks onto DOS, which has its roots in the 8086 processor. Every time people tout the advantages of Linux, they compare it to Win9x. Beating a legacy desktop OS in terms of uptime, etc., is NOT impressive. Compared to Win2K, Linux's technical advantages are pretty minor. There are some, but not many. Compared to the BSDs or commercial UNIXes... well, Linux doesn't look that impressive. It has advantages and drawbacks, different engineering decisions.

      The problem with UNIX is an LCD (lowest common denominator) and designed-by-committee problem. Having a common API that programmers can target is tremendous; it helps with portability. However, failing to keep moving that API forward is a mistake.

      As it stands there are many applications that only work on one variant. Extending the UNIX common API once or twice a year to encompass vendor extensions would be a tremendous boost, and allow UNIX to escape this trap. If Sun has a great idea and incorporates it into Solaris, their ISVs should take advantage of it. The rest of the UNIX world should have it within a year (or two at most) so ISVs can port to other UNIXes. As it stands, you either write to an old standard OR to a particular UNIX. Neither is a good choice.

      Alex
      • You said:
        As it stands, you either write to an old standard OR to a particular UNIX. Neither is a good choice.

        This isn't nearly the case, if the code IS DONE RIGHT, it can be compiled on all of the common Unices and a large portion of the uncommon ones. I've been reading Kernel Traffic's GNU/HURD report quite a bit recently, and to paraphrase one of the major package porters:

        "Those packages that use autoconf/configure have been amazingly easy to port, usually needing a few lines of editing at most. Those that don't require enormous amounts of effort."
        • This isn't nearly the case, if the code IS DONE RIGHT [...] "Those packages that use autoconf/configure have been amazingly easy to port, usually needing a few lines of editing at most. Those that don't require enormous amounts of effort."

          That doesn't really mean autoconf/configure is a magic bullet. If I write something that uses kqueue and want to port it to Linux (or Solaris) I have to write non-kqueue code. Autoconf will merely figure out which part of my code to enable or disable, it won't take my kqueue code and make it work elsewhere.

          So to use the new interfaces, and be portable I have to write code to use the old interfaces. If the old interface doesn't exist I have to disable part of my application's features.

          Using kqueue as an example (once again), if I have an X program using a toolkit that lets me write my own file I/O callbacks and timeouts, but no callback for wait4 or the like, and I want to know when a child process exits, I don't have that many good choices. I can use kqueue, which pretty trivially converts a process exit into file I/O (or at least a read-ready event, plus a call to kevent rather than read). Then it will only run on two or three Unix systems. I can write a SIGCHLD handler that sets a flag and use a periodic timeout, but then I either burn CPU, or it takes too long to see the event. I could skip the SIGCHLD handler and just call wait4 with WNOHANG in the timer callback. That has roughly the same problems that the SIGCHLD answer does.

          I could write both, and then use autoconf to decide which to compile. Then I have to test both. The documentation has to say "On some platforms there can be a noticeable gap between the tracks, on others you get no gap in the tuneage".

          In other words it's not that doing the code right helps, it's doing the code twice, and it only helps so much.

          Last comment: yeah, the stuff that already uses autoconf ports easier than the rest of it because someone already did a lot of work to make it run multiple places, and may have decided to ditch features to avoid more problems.
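
The timer-callback fallback described in the comment above (poll with WNOHANG on every tick instead of using kqueue) might look roughly like this. It is only a sketch, assuming a POSIX system: `poll_child`, `wait_with_timer`, and `spawn_demo_child` are hypothetical names, and the sleep loop merely stands in for a toolkit's timeout callback. It uses `os.waitpid`, Python's wrapper for waitpid, in place of wait4.

```python
import os
import subprocess
import sys
import time


def poll_child(pid):
    """Return the child's exit code if it has exited, else None. Never blocks."""
    reaped_pid, status = os.waitpid(pid, os.WNOHANG)
    if reaped_pid == 0:
        return None  # child is still running
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # killed by a signal: report it as a negative number


def wait_with_timer(pid, interval=0.05):
    """Stand-in for a toolkit timer: poll every `interval` seconds until exit."""
    polls = 0
    while True:
        polls += 1
        code = poll_child(pid)
        if code is not None:
            return code, polls
        time.sleep(interval)  # a real GUI would return to its event loop here


def spawn_demo_child(exit_code):
    """Start a short-lived child process that exits with `exit_code` (demo only)."""
    return subprocess.Popen(
        [sys.executable, "-c", f"import sys; sys.exit({exit_code})"]
    )


if __name__ == "__main__":
    child = spawn_demo_child(3)
    code, polls = wait_with_timer(child.pid)
    child.returncode = code  # we reaped the child ourselves, so tell Popen
    print(f"child exited with {code} after {polls} poll(s)")
```

The trade-off the comment complains about is visible in `interval`: a short interval wakes the process constantly, a long one delays noticing the exit.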

      • by mattdm ( 1931 )
        Ok, first, the linux system of actually lookin' in a file to see what file type it is seems pretty un-idiotic to me.

        But more importantly, I strongly disagree with your point about ACLs. Different privilege levels might be useful (as opposed to simple user-or-root), but I don't see a good reason to apply this to a filesystem. As it is, it's very easy to see quickly exactly who has what rights to what area -- with complicated ACLs, everything can get confusing and you might not notice a security problem. Sometimes simple is good.

        The private groups notion is far from an "ugly hack" -- in fact, there's no "hack" involved at all: it's just *using* the group and umask functionality in a nice elegant way.
    • Re:Linux? (Score:2, Insightful)

      by Remote ( 140616 )

      I don't think Windows does a bad job at storing data type information. It just doesn't try to. What Windows stores in the filename is file format information. A song tablature, ASCII art and C++ source code are very different things, but you can call them all TXTs and operate on them with no problem at all. The author really messes things up a bit in this matter. You can have, say, two LZW-compressed paletted images, one as a GIF and the other as a TIFF. Pretty much the same data type, but with different headers/tags, different LZW max. prefix length, maybe different byte order. Same for a JPEG TIFF and a JFIF. Actually, what is the point in saying Image/gif when you can't have Sound/gif or Text/gif?

      I really don't think Apple came up with an extensionless filename scheme purely out of conceptual considerations. Anyone who has ever tried to educate someone on how to use a computer for the first time knows that file extensions can be confusing! The Mac was built to be easy. I would go as far as to say it was built to reach people who were afraid of computers. The fact is that some other people do need a command prompt, and that interface does benefit from file extensions.

      Now, Linux is not following Windows at all on this. Fire up Konqueror and see how it identifies most files, extension notwithstanding. Or try man file.

      But, what do I know?

      • Anyone who has ever tried to educate someone on how to use a computer for the first time knows that file extensions can be confusing!

        Anyone who has ever had to work with a Mac user knows how confused they get by files with an extension (e.g., .DOC) but with no creator and filetype information. They flail on the mouse button and then tell you that your document is damaged or that they can't open your TeachText document and oh by the way why did you do that spreadsheet in TeachText anyway? They simply cannot understand the idea of opening the application and then opening the document.

        Now I know that this can be interpreted as a virtue of the Mac OS because it's allowing you to focus on your "job" and not on your "computer", and maybe it is. But it also strikes me as a little self-defeating because the user doesn't ever get beyond the flail-on-the-mouse stage, which sounds to me like they're not getting much out of their computer.
        • I've run into this many times with the graphics departments at several companies.


          The Mac users have no idea how to deal with PC files which they CAN read.

    • Linux programs discover the type of the file by looking at the file's contents (ref: file(1)). I think this is an obvious and straightforward way to determine file type, and is therefore not prone to implementation bugs. The mapping of file type to program is handled by application environments like GNOME and KDE. Nautilus determines a file's type and offers a number of programs useful for manipulating the file. I think it works great.
      • Well you say it works for you, so I guess I shouldn't disagree with you. But IMHO it's not a good way to do things. A given program should be able to determine if a given file is of its own type. However, when you're given a file and you have to determine which of a thousand applications it belongs to, that's a whole world of pain. If everyone agreed to put something in the same place in every file, say at the beginning, then it could work. But, it would be ugly, and... Hey, if it's in the same place for every file, why not just take it out of the file and associate it instead? Nautilus may well be able to determine file types, but it's not going to be efficient at it. For example if you throw it a big directory full of stuff it's going to have to scan arbitrary amounts of those files to work out what type they are, and I bet it's not 100% accurate either.
        • For example if you throw it a big directory full of stuff it's going to have to scan arbitrary amounts of those files to work out what type they are, and I bet it's not 100% accurate either.


          So, the metadata implementation suggested somehow frees the interface from scanning a large list of files? It's still gotta build a list of files, and it's still gotta look up metadata for each one. You'd have to have a darned big directory to have any significant difference there. Macs make up part of the systems that I admin, and they're certainly no joy to view directories with lots of files within - mostly because of the metadata (the B-Tree thing is pretty cool, though).

          As far as accuracy goes, using magic is pretty accurate, but you're right - it's not 100%. It allows a nice migration from older filesystems without having to touch the filesystem itself, though, and is fairly easy for graphical frontends to implement. Personally, I think the tradeoff in accuracy is a pretty good one, 'cause it allows the CLI to continue functioning cleanly.

          • So, the metadata implementation suggested somehow frees the interface from scanning a large list of files? It's still gotta build a list of files, and it's still gotta look up metadata for each one. You'd have to have a darned big directory to have any significant difference there. Macs make up part of the systems that I admin, and they're certainly no joy to view directories with lots of files within - mostly because of the metadata (the B-Tree thing is pretty cool, though).

            I disagree with this point. In a filesystem that stored the file type (I prefer MIME) as part of the filesystem data, the additional time required to read in the filetype would be imperceptible to the user. The file type would be read in at the same time as the mtime, permissions, etc. To run file(1) on each file would require open()ing each one, reading in the first hundred bytes or so and comparing it against a large list of magic numbers. This entire operation is merely overhead; if the Content-Type were stored as part of the file's metadata it would already have been fetched by the time the system could even start file() scanning.

            As others have pointed out, BeOS did this right by making this part of the interface instead of like a Mac where the type/creator info is hidden from the user and not editable without downloading additional software.
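
For readers who have not looked inside file(1): the open/read/compare step discussed in this subthread can be sketched in a few lines. This is a toy, not file(1)'s real magic database. The table holds only a handful of well-known signatures, and `sniff_type`, `sniff_file`, and the 128-byte read size are assumptions made up for the illustration.

```python
# A tiny "magic number" table: (leading bytes, MIME type).
# Real magic databases hold thousands of entries and match at arbitrary offsets.
MAGIC_TABLE = [
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"II*\x00", "image/tiff"),  # little-endian TIFF
    (b"MM\x00*", "image/tiff"),  # big-endian TIFF
    (b"%PDF-", "application/pdf"),
]


def sniff_type(header: bytes) -> str:
    """Guess a MIME type from the first bytes of a file's contents."""
    for magic, mime in MAGIC_TABLE:
        if header.startswith(magic):
            return mime
    return "application/octet-stream"  # unknown: the "not 100% accurate" case


def sniff_file(path, nbytes=128):
    """Open the file and sniff its header; this is the per-file cost being debated."""
    with open(path, "rb") as handle:
        return sniff_type(handle.read(nbytes))
```

Note that every call to `sniff_file` costs one open() and one read() per file, which is exactly the overhead the comment above attributes to content sniffing, while a type stored alongside mtime and permissions would come back with the directory listing.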

            • I suppose I'll give you that, as long as it's implemented in a BeOS-like fashion and not MacOS-like. I really like Be's FS. It's a shame that a few key apps are missing for Be (compatible web browser?), or I'd still be using it...


              I don't see the problem in integrating a file-like functionality into the kernel/filesystem and grabbing the data from the file itself instead of having an extra piece of data stored somewhere for each file. The stat() routine gets modified (or renamed to stat2()) to return an extra piece of information, say, a hash code into a magic table. The magic table can easily be stored as a hash to reduce the lookup time to near-imperceptible, and this would still allow [well-written] "legacy" apps to continue functioning properly. Extra possibly out-of-sync file problem avoided, backwards compatibility preserved.

    • Actually, I lied. I don't hate MacOS; I just wanted to get your attention by yelling about it. Now that you're here, though, I have to say that I LOVE the MacOS, and have ever since I first used it, before it was even called MacOS. I started with System 7, which was so attractive and easy to use that it's still my bar for measuring other interfaces.

      But if there is one thing I intensely dislike about MacOS, it's the metadata. I know I'm practically alone in the Mac camp, but I hate metadata. I have always thought it was just a space-hogging pain in my ass.

      Now, the space issue is no longer a big concern since we have such big, cheap drives that a little filesystem metadata isn't such a burden on capacity. But back in the days of floppies I was pissed that I could fit so few files on a floppy when my friend with DOS could fit noticeably more. I was especially annoyed that even when I formatted a disk as a PC floppy, the Mac would still waste my space by creating and hiding from me files and folders on the disk to constitute the resource forks. I wanted every kilobyte, which counts when you're cramming a lot of small files onto a lot of small disks.

      But of course this is no longer the big issue it used to be. But if I were storing large numbers of files and running out of space on a Mac, I'd still silently curse all that metadata wasting my capacity.

      The part that still bothers me, now that capacity is no longer a substantial issue, is that in Windows or *nix I can instantly change file types from the interface, but not with Mac. It comes up a lot--many times a day. Click a filename, change three letters, and a text file is recognized as a script or batch file to be executed rather than opened. A click and three letters, and a file I just downloaded from USENET goes from text to UUencoded so that when I double-click it will be decoded for me. A click and three letters is all it takes to change a file's type and its application association from the GUI, without having to resort to some clunky special editor. And it's even better if I need to change the type/association of a great number of files--just open a CLI and type a quick line, and it's all done. What a pain it would be to have to use a metadata editor instead of just manipulating three letters in filenames. Simple file extensions put more power over the file within easy, simple, even automatable reach.

      The advantage of metadata is something many Mac users, and theorists like this article's author, seem to believe in, but I cannot see it. For instance, it's thought a great advantage that you can set a file to open with any application, despite the filetype. I hate downloading things on a Mac because of this. Some idjit will have a file set to open in an application I don't have, and the computer may be too stupid to know that I always open that file type in Application X. A dialog pops up on any reasonably modern MacOS to help, but it's still a big pain in the ass compared to having a PC automatically know what I open that file type with. Even more annoying is when I really do have the application the file is set to open with installed, but I always want that file type to open in a different app. This most often happens with graphics files--I do not under any circumstances want to have Photoshop or Graphic Converter open a graphics file, just because that's what it was created in. I have a simple image viewer for viewing images. If I want to edit them, *then* I open them in Photoshop. Same for Premiere and others--I do not want a big, slow editor to open my files just because that's what they were created with; we have smaller-footprint and more versatile file viewers for that.

      The other part of it is that the "simplistic" (sometimes the most simple designs are the most elegant, while the more complex are just gaudy) file typing systems also solve the problem of opening certain files of a given type in one application but others of the same type in another application. Metadata proponents always point out how "great" it is to have one, for example, JPEG open in JPEGview or whatever, while another JPEG opens in Photoshop; one .wav opens in a player, while another opens in an editor or burner. Well, I think the solution offered by Windows and by some *nix environments is better, easier, simpler, more elegant.

      A simple context menu, brought up by right or center-clicking, provides any options you could want. That way to open something in my viewer application, I just double-click--I know on my Windoze box that all image files (except .psd) will automatically open in my viewer, ACDSee (which recently became available for Mac, too)--no surprises, no metadata editors needed. If I want to edit it, I just right-click and choose the command "Edit" from the menu, which is set to open images with Photoshop. Same with .wav and other such--double-clicking opens in WinAMP, right-clicking and choosing "edit" opens in SoundForge. You can create any action, and choose any app to be associated with that action, for each file type--and then a list of all the possible actions for that file type will be displayed when you right-click a given file. But it will open in whatever you've set to be your standard viewer, by default, if double-clicked. Much better than relying on hidden metadata.

      But even better and simpler than having to set up the actions and associations in the Folder Options dialog is just using the Send To submenu that is brought up on right-click--just drop shortcuts to the apps you usually use into the Windows\SendTo folder, and those apps will appear on the Send To submenu when you right-click. That way I can easily open any file with any application, by using only one right-click and one left-click. In terms of launching files, it's like having the flexibility of a CLI, but within the ease-of-use of a GUI. That's one feature the Windows GUI actually got right, and got right very early on. MacOS can keep its metadata, but this is easier, simpler, better. I love the Send To submenu, though it's usually under-utilized by most people.

      I hate to say it, but the metadata folks are IMHO going the wrong way. I want more power and flexibility within my clicks, not less. I hate having to edit metadata when a simple three-letter change is all that would be needed in *nix and 'doze. And as I said, the advantages of metadata in terms of application/file association are entirely negated by the right-click menu and its Send To submenu in Windows, and similar functionality in some *nix GUIs. Metadata may have good uses, but none I can think of that can't be done more simply and elegantly. I also dislike the idea of my filesystem hiding things from me, which unfortunately is exactly what MacOS does and what the newer NTFS in Win2k and up can do (I believe Ars had an article when Win2k came out about the new NTFS and some of the still-largely-unused metadata fields). Ext2 or FAT32 all the way, baby--and before you poo-poo FAT32, it may have almost no modern features, but it is straightforward, simple, and actually very fast in performance (thanks to the fact that it implements no real modern features); I recall it beating out NTFS in terms of raw speed in an old Ars article. Poor crash recovery is its main weakness.

      I like to keep things as easy to manipulate as possible. And contrary to what many make the mistake of thinking, file extensions are not just easy for CLIs--as I said, they make sense in a GUI too, since they can be directly manipulated from within the GUI's file browser, without having to open the file in a metadata editor. They also make the type of file crystal-clear--especially important if you don't want to accidentally run an executable that has an icon to make it look like a document. Unless OS X has some way which I haven't noticed to visually set executables apart from other file types, even when they're on the desktop or somewhere else that doesn't show details, I can't wait for someone to create lots of OS X viruses that have common file icons. That's already the case in the Windows world, where you'll find files called Report.doc.exe that have Word icons, but if you notice the trailing extension you won't mistakenly execute them (though the "show extensions for all file types" option isn't the Windows default anymore, alas). How can you tell by a glance in OS X, or any other place where metadata rules instead of file extensions?

      Oh well. Windows may not have a lot right--but it does have its use of simple file extensions and simple context menus right. I always hated editing resource forks. It's just another *unnecessary* layer getting between a man and his hardware. Tell me one very useful thing that can be done with filesystem metadata, that can't be done easier and put more in direct control of the user. And before you say "labeling," like MacOS prior to X used to have--that's what folders/directories are for. :-)
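
The "open a CLI and type a quick line" bulk retyping that this comment praises amounts to a rename loop. A rough Python equivalent is sketched here, with `change_extension` being a hypothetical helper name; it assumes that skipping, rather than overwriting, is the right behaviour when the target name already exists.

```python
from pathlib import Path


def change_extension(directory, old_ext, new_ext):
    """Rename every file in `directory` ending in `old_ext` to end in `new_ext`.

    Returns the new paths, sorted by name. Existing targets are never overwritten.
    """
    renamed = []
    # sorted() takes a snapshot of the listing before any renaming happens
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file() or path.suffix.lower() != old_ext.lower():
            continue
        target = path.with_suffix(new_ext)
        if target.exists():
            continue  # assumption: leave clashes for the user to sort out
        path.rename(target)
        renamed.append(target)
    return renamed
```

For example, `change_extension("downloads", ".txt", ".uue")` would retag a folder of saved USENET posts in one call. Only the name changes; the file contents are untouched, which is both the convenience and the weakness the thread is arguing about.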
      • That's why on every MacOS system I use, I always get this [helsinki.fi]. I cannot live without it. That, combined with this [aone.net.au], solve the problems you describe quite nicely.

        On Mac OS X it's a little different, though. The "Types Change" plugin isn't available (yet?). But the "Open Using" plugin isn't really necessary, since you can force-open any file by dragging to an app in the dock while holding down command and option. Hopefully there will be a way to change the type and creator of a file on X soon, and all will be back to normal.

      • I kind of agree with this, but Mac OS X fixes it. In the file info there is a place for the application, where you can choose an application that opens only that file or all files of that type.
      • Mac files do take some extra space, but it is the resource fork that causes it, not the meta data. Since the resource fork is implemented as a separate file, a small Mac file uses 2 disk sections, not one. The type/creator meta data is only 8 bytes.

        The article addresses the common misconception that the resource fork is metadata at some length in http://arstechnica.com/reviews/01q3/metadata/metadata-6.html ("Second, I mention it because...")

        I agree that handling metadata in MacOS should be easier. Just a simple command to view and edit it would solve most problems. But don't confuse that lack of tools with a fundamental problem with the metadata concept itself. And as someone else pointed out, there are pretty good freeware tools available to fix this.
      • by Fred Ferrigno ( 122319 ) on Tuesday August 21, 2001 @08:29PM (#2202863)
        This isn't a problem with metadata, just a problem with MacOS' file typing.

        BeOS handled all this very well. Double click to open with the default app. Right click to see a list of every program on your hard drive that opens that kind of file or files like it. (IE, a text editor would show up as an option for an HTML file.) Choose another option and open a dialog to set a file-specific preference.

        I must have said "BeOS did it better" about six times today. I feel like an Amiga user.
      • What a pain it would be to have to use a metadata editor instead of just manipulating three letters in filenames

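        -- foo is assumed to already refer to the file you want to retag
        tell application "Finder"
            set creator type of file foo to "8BIM" -- creator code: Photoshop
            set file type of file foo to "EPSF" -- type code: EPS
        end tell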


        Run that in Smile and your problems are solved... or you can just use Snitch.

      • "The part that still bothers me, now that capacity is no longer a substantial issue, is that in Windows or *nix I can instantly change file types from the interface, but not with Mac. It comes up a lot--many times a day."


        In the current shipping version of MacOS (read: X) you can "Show Info" on a file (same thing as "Get Info" in OS 9) and there is a pull-down menu that offers the application option. It's very easily noticed. There you can reset not only the metadata for that file, but also the default for all files of that type. You no longer need to use FileTyper. Anyhow, in OS 9 this was a non-issue. If you wanted to open something with a different application, all you had to do was drag the file onto the icon for the app you wanted to launch it with. You could even do multiples.


        "I hate downloading things on a Mac because of this. Some idjit will have a file set to open in an application I don't have"


        This is deceitful. If you download a normal everyday file off the internet, odds are you're not going to get the meta information of that file, even if it was originally on a Mac. Almost every single application for downloading on the Mac uses the system-level settings for mapping extensions to type/creator, which will ALWAYS point to an app you have. The only times you will get meta info with the file are when you 1. download a StuffIt archive or 2. use Hotline. I personally can't remember the last time I downloaded a file as a StuffIt archive, unless it was an installer, in which case it was irrelevant since it was self-contained. Now if you are heavily using Hotline for exchanging files, then odds are you are a pirate and deserve what you get.
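
The system-level extension mapping mentioned in this comment is, conceptually, just a lookup table from extension to a (type, creator) pair. Here is a minimal sketch with made-up names (`EXTENSION_MAP`, `type_creator_for`). The EPSF/8BIM pair is taken from the AppleScript elsewhere in this thread; the TEXT/ttxt pair and the "????" fallback are assumptions about typical classic MacOS values, not a dump of any real settings file.

```python
import os

# Extension -> (file type, creator code). Both codes are always four characters.
# Illustrative entries only; a real system table is user-editable and far longer.
EXTENSION_MAP = {
    ".eps": ("EPSF", "8BIM"),  # pair used in the AppleScript earlier in the thread
    ".txt": ("TEXT", "ttxt"),  # assumption: plain text opened by the bundled editor
}

# Assumption: placeholder pair for extensions the table does not know about.
UNKNOWN = ("????", "????")


def type_creator_for(filename):
    """Pick the type/creator pair a downloader would stamp on a new file."""
    extension = os.path.splitext(filename)[1].lower()
    return EXTENSION_MAP.get(extension, UNKNOWN)
```

Because the table lives on the receiving machine, the creator it assigns is whatever that user configured, which is the point being made above about downloads always mapping to an app you have.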

        • no, I never know whether my porn's gonna open in JpegView, QuickTime Picture Viewer, or Photoshop - and that's all with Jpeg files, with JpegView icons.
      • First, you say we don't need no stinkin metadata, then you say we need it with more power and flexibility.

        I think we can all agree that application binding is a cool thing, and saves us a lot of work as an automatic shortcut to opening documents.

        The problem here, and NOBODY has gotten this right as far as I'm concerned, is having a decent way for power users to edit and manipulate metadata, and configure the OS's treatment of it.

        Yes, the high and mighty programmers have access to it. The hackers have access to it. The grannies don't want or need access to it as long as application binding functions in a basic, and intelligent way. The power users have access to it, the same way the hackers do - but it's often a pain in the ass, using tools that weren't really designed to do anything other than mess around. Nothing useful can be done with these tools ON ANY OS, in terms of allowing a power user to quickly and easily manipulate the metadata to set up a custom behavior that suits his or her purposes.

        And that's really the whole problem.

        That, and of course the fact that filename extensions really have got to go.

        You'd think that the OS vendors would think about this, and provide the users with some nice tools. And I'll agree with you, Microsoft's solution is kind of nice: the CM (context menu) for a batch file in Explorer gives you the choice to RUN the batch file or EDIT it. I think power users need that kind of flexibility for HTML as well: OPEN the file in a browser, or OPEN the file in an editor.
        I'm often frustrated with graphics files - sometimes I want to run a quick image viewer to display an image file - sometimes I want to tear it up in Photoshop. Launching Photoshop is a 60-second ordeal on some people's machines, and it's a necessary ordeal if you want to do serious editing.
        You see the problem here? Granny needs the file to open in her browser on a double-click. The power user or content creator needs TWO choices on the execution of an icon: Edit or View. And in the case of executable content, Execute. This needs a user paradigm - probably a lot easier to use than a CM. And it must be MUCH quicker than the stupid "open this unregistered file in one of these" deal, which is annoying and slow on every OS I've seen it on.

        When I think about how annoying this problem is, and how NOBODY has ever offered a real, workable solution for it, I see an opportunity...
  • Very interesting. I never really thought about metadata before, but it brings up a lot of points about the mistake of using file extensions.

    File extensions do serve a convenient purpose with a command line, as you can manipulate them more easily without using multiple tools. However, if the metadata were stored outside the filename, we could have had UNIX, GNU, BSD, and DOS/Windows utilities to manage it. If all systems were designed to keep track of the metadata, it would have been a better world.

    It is unfortunate that the technical lowest common denominator (DOS and DOS-based OSes) dictates so much of our system. While Windows NT based systems (including Win2K, WinXP, etc.) have made tremendous strides, there is a constant need to maintain compatibility that holds us back.

    I think that it makes sense for Apple to adopt the file extensions, as unpleasant as they are, to support a networked world. The author's suggestion of adding them on transport makes sense, but definitely leaves something to be desired. It would be confusing to transfer Word documents around and have the extensions pop on and off depending on the environment. If the Mac leaves them alone, it still leaves something to be desired because the file name changed when it left the Mac it was created on for the file server, and when it comes back it has a different name.

    It's a shame that a standard for storing the metadata wasn't created long ago. While the PCs wouldn't use the data, it is a shame to lose it. It is also a shame that we have to work towards the lowest common denominator. It's one thing to support it, it is another to adopt the conventions.

    Alex
    • File extensions are wonderful, they are for the HUMAN to understand. They tell the human what is needed in an easy to understand format. Putting .wks on a file tells most people exactly what is needed, and the program itself can figure out the details from the file itself, or the metadata.

      The computer itself shouldn't use the extension for anything but hints for filling in unknowns with default values, according to the user's conventions.
  • I think the reason Apple went to a dual file type system (extensions and metadata) is because it's too hard to implement the necessary level of interoperability otherwise. Suppose you want to keep the usual Classic MacOS method of just having file type/creator code metadata. You merrily store your files on your hard drive, with no extensions in sight.

    Now you share your drive on a heterogenous network. A Win98 box connects, and looks at your files via NFS. Does the MacOS X-side NFS server automatically translate the filenames and add extensions?

    Then another MacOS X user uses ssh to connect to your box. He types 'ls'. Does he see "virtual" extensions or not? What if it's a Windows user telnetting in? How would your box even know what OS the remote user was coming from?

    It's just too easy to run into inconsistencies if you stick with the system of mapping file name extensions. Yes, extensions annoy me, especially since older versions of MacOS are stuck with a 31 character limit (yes, it's 31, not 32--Ars is wrong), and I have to keep file names short to be backwards compatible. Unfortunately, it's just another bad MS decision we have to live with.
  • by Anonymous Coward
    So far, there are at least 3 fallacies in the "Fundamentals" section:

    1) A file's size is not metadata: A file can best be defined as an ordered set of bytes (or bits, or words, or whatever atomic unit your system uses), and the size of that set is intrinsic to it, not external.

    2) A file's modification time is conceptually unrelated to its contents. For example, most systems consider a file "modified" even when its contents are replaced by totally identical contents, and some systems provide means to change a file's contents without changing its modification time. Generally, systems use the modification time to note the time of an action that the user would see as causing a file to be modified, which is not always the same thing as noting the time that a file's contents are actually changed. I know of no system that records the latter time.

    3) A file's type can change at will, not just to increase or decrease the "accuracy" of the typing. It's rare that a file would be useful when viewed as data of two or more independent data types, but there's nothing intrinsic in the concepts of files, their types, or metadata, to prevent this. Thus, for example, hackers can get some perverse enjoyment from writing source code that works simultaneously in multiple programming languages.

    In general, the author's categorization of metadata into "immutable" and "mutable" is nonsensical. File metadata, by definition, is independent of file data, and is therefore mutable independently of it. Sometimes systems create tighter links between metadata and data, for example when Photoshop causes files created with it to be of a certain type, or when users make sure the names of files important to them are in uppercase, but that's a characteristic of the system (Photoshop or user conventions in these examples), not an intrinsic characteristic of data and metadata... And in the introduction, the author warns against reading the "Fundamentals" section with an eye on system implementations :-).

    I'm going to guess the author is reaching beyond logic to make this categorization so as to give file typing a role distinct from and more important than file naming. Needless to say, this is counter-productive.
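
    Point (2) is easy to check for yourself. Here is a rough Python sketch, assuming a POSIX-like system; the scratch file name and the timestamp are made up for illustration. It moves a file's modification time without touching a single byte of its contents.

```python
import os
import tempfile

def mtime_demo():
    """Return (contents_before, contents_after, mtime_after) for a scratch file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "note.txt")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(b"same bytes")
        with open(path, "rb") as f:
            before = f.read()
        # Change only the timestamps; the data blocks are not rewritten.
        os.utime(path, (1_000_000_000, 1_000_000_000))
        with open(path, "rb") as f:
            after = f.read()
        return before, after, int(os.stat(path).st_mtime)

before, after, mtime = mtime_demo()
```

    The contents read back identically before and after, while the timestamp is whatever the caller asked for, which is the sense in which the two are independent.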
  • I suggest to all read here:

    http://www.beosbible.com/exc_filetype.html [beosbible.com]
    and here:
    http://www.beosbible.com/exc_query.html [beosbible.com]

    The BeOS has solved the problem, years ago. The BFS has integrated all these features into the OS itself, so all applications are making use of them. The Byte.com BeOS articles from Scot Hacker are also a must read!

  • Unix pipes. How else are you going to get file type metadata if it isn't in-band. That is what the magic number is all about. Pipes, stdin, stdout, etc.

    I think this is purely an application level problem and not a filesystem problem.

    It still matters in the gui world too. If we ever develop GUI drag and drop style graphics filters and such, say a webcam output into a filter into something else, that info is still in-band.

    How would you represent the file type of a named pipe, or a socket?
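
    For what it's worth, in-band sniffing of a stream looks roughly like this in Python. This is only a sketch: the signature table is a small illustrative sample, and sniff is a made-up helper, not part of any standard tool.

```python
import io

# A few well-known signatures; illustrative, nowhere near exhaustive.
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"%PDF-", "pdf"),
    (b"#!", "script"),
]

def sniff(stream):
    """Return (guess, head). The stream may be a pipe, so nothing is seeked;
    the caller must hand `head` on to whatever consumes the rest."""
    head = stream.read(8)
    for signature, name in MAGIC:
        if head.startswith(signature):
            return name, head
    return "data", head

guess, head = sniff(io.BytesIO(b"GIF89a\x01\x00\x01\x00"))
```

    The same function works on sys.stdin.buffer, which is the point: a pipe or socket has no filename and no filesystem metadata, so the first few bytes are all there is to go on.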
  • In principle I dislike extensions as much as the next man, but Operating Systems everywhere manage to do a repeatedly bad job of managing the resource side of things. Yes, I'm talking about Windows' associations and Mac resource forks (I became quite popular, at an office, some years ago, providing utilities which would strip the first 128 bits out of Mac generated Photoshop and Illustrator files).

    The author of this piece even identifies the horror of allowing OSs to hide the extensions (one of the many things that gets fixed when working on a Windows machine). How could the possibility of allowing two files, in the same folder, to have the same name be acceptable, EVER?

    If there was a standard, say header, section required by all files this would be fine, but this is obviously OS dependent; remember, most of that other metadata, creation date, etc. is all stored in the FAT on most OSs. A world without extensions means that all file access would need to be pre-processed so that the correct application could subsequently be applied. Opening a file is, last time I checked, more of an overhead than examining an extension. And then what? The application police move in, preventing access to files that haven't been created in the right application?

    I want more metadata about files, I want to get useful, searchable information, perhaps the real place to put it is in the file itself, like so many applications already do. Taking the responsibility away from the filename and putting it in the hands of the operating system, for encoding and decoding this metadata is fine as long as the OS doesn't break, lose the key, and remembers to enforce gatekeeping functions so that when file goes off to play in the big wide world it doesn't drag along any of that OS specific data with it.
  • He spends a lot of energy attacking Apple's recommendation that file name extensions should be added to files in addition to individually storing the creator/type in some OS X style fashion. He makes what appears to be a good argument... For those who didn't read, I'll briefly outline.


    1. funny.txt.vbs emails where .vbs is hidden (as OS X.1 will offer) can trick the user into opening an application


    2. Hidden extensions allow Finder.app and Finder.whatever to appear as "Finder" in the same Folder....


    He then goes on to say why Apple would recommend developers use extensions (which is redundant)... A networked world demands MacOS be a better "citizen". He claims extensions are unnecessary since email apps can append extensions to files when sent... Not to mention his speculation that Apple would drop its current model for a Windows model...


    Problem with his analysis: e-mail isn't the only way to share files in OS X. Currently OS X offers FTP, HTTP, Appletalk, NFS, SSH and X.1 will add CIFS. Appletalk handles Meta information transparently, going from Mac to Mac, no need for extensions. FTP, HTTP, SSH and NFS (NFS will almost always go to a flat filesystem) offer no way to store/send OS X style meta information. Yes, OS X treats an NFS drive (and CIFS drives if you use Sharity) as a UFS drive and stores Meta data properly so that the Mac can use the file, but the remote computer has NO idea what kind of file that is, unless it has an extension. So when a Mac user casually copies an extension-less Word document to a PC zip disk and puts that disk in a PC, the file is useless (unless the user knows of the problem). So it is clear file extensions are needed for a networked environment....


    But Mac users don't like extensions so Apple will let us hide them.... which creates the problems he described (funny.txt.vbs and duplicate file names in the same folder). The first is really a non-problem since the Mail application in OS X doesn't hide file extensions even if the file is named funny.txt.app, and double clicking it in Mail does NOT launch the file. This potential problem can be further alleviated by noting what kind of file it is below it in Mail. The second issue of duplicate file names can be solved easily too...don't allow it. In other words DumbName.jpg and DumbName.txt should not be allowed in the same folder. Then hide all the file extensions and the users would be none the wiser.

    • double clicking it in Mail does NOT launch the file

      Um, yes it does. I just mailed myself FontExamplar.app, and double clicking on it did run it (after telling me it might have a virus and stuff, then I clicked the "What's a virus, please bone me" button and it ran).

      And we know that under Mac OS X.0.4 Mail.app doesn't hide extensions, but I'm not sure that OS X.1's Mail.app won't. I would expect it to follow the Finder setting. We also don't know what OS X.1 does with more than one "extension": does it strip them all? None? Or just one? I'm guessing just one, but I'm aware that it is a guess.

    • The second issue of duplicate file names can be solved easily too...don't allow it. In other words DumbName.jpg and DumbName.txt should not be allowed in the same folder. Then hide all the file extensions and the users would be none the wiser.

      Oooh, yeah. Here goes me...

      Create file: BaseClass.cpp
      Create file: BaseClass.h
      That file already exists, choose another name.
      me: WTF?!

      I generally use a source and header directory (file) differentiation, but not when it's a quick and dirty proof of concept test...
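
      To put the collision problem in concrete terms, here is a small Python sketch. It assumes that "hiding the extension" means dropping only the last dot-suffix, which is a guess about the behaviour, and hidden_extension_collisions is a made-up helper name.

```python
import os
from collections import defaultdict

def hidden_extension_collisions(names):
    """Group file names that look identical once the last extension is hidden."""
    shown = defaultdict(list)
    for name in names:
        stem, _ext = os.path.splitext(name)  # strips only the final suffix
        shown[stem].append(name)
    return {stem: group for stem, group in shown.items() if len(group) > 1}

collisions = hidden_extension_collisions(
    ["BaseClass.cpp", "BaseClass.h", "funny.txt.vbs", "README"]
)
```

      Under that assumption BaseClass.cpp and BaseClass.h both display as BaseClass, and funny.txt.vbs displays as funny.txt, so forbidding the collision hurts programmers while hiding the suffix still leaves the disguise in place.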
    • that's not the worst thing about funny.txt.vbs.

      The WORST thing is that stupid shell-scrap garbage, a feature which nobody ever uses, and which HIDES extensions even if you've configured the OS to explicitly SHOW extensions so you don't get clobbered with this kind of thing. You assume it's a txt file, because you KNOW you told the OS to show you extensions - but not when it's a shell-scrap file, which was an obscure enough feature that even seasoned power users were unaware of it in their day-to-day use of the OS.

      As a viral engineering feat, funny.txt.vbs was genius.
      As an OS feature, shell-scrap, as far as I'm concerned can remove the s-es and become hell-crap.
  • by sllort ( 442574 )
    Since Slashdot was down so long, I actually had a chance to read and understand the article before posting. Perhaps there should be a pause between article posting and allowing comments? Anyway, to get on topic:

    I disagree wholeheartedly with the author's assessment that making the file type part of the name is a "bad thing". I disagree with his statement that the type of a file is immutable data. It is not. I have, many times, created a text file, written some html, and renamed it ".html" to load it in a web browser. Using a Mac has always been infuriating to me because I cannot easily change the application it is loaded with. It's changeable, sure, but not as easily as you can change a simple, easily remembered mnemonic. Linux has echoed this paradigm for good reason. How hard is it to change a bash script to a different shell? Change the first line. On a Mac, this would require you to change an embedded 32 bit identifier.

    The argument is bogus. slashdot.pl and slashdot.txt should NOT collide on my desktop - the type IS part of the name. The mixing of file names & types was neither a hack nor a mistake. To those of us who use computers not as an information appliance but as information builders, the ability to easily manipulate file type data is a way of life.

    Thought provoking article, nonetheless.
    • Re:Wow (Score:2, Insightful)

      by David Roundy ( 34889 )
      I disagree with his statement that the type of a file is immutable data. It is not. I have, many times, created a text file, written some html, and renamed it ".html" to load it in a web browser.

      I'm afraid you misunderstood his definition of immutable. In this example, you changed the data, and what was originally a plain text file became an HTML file. His definition of immutable was that if the file data changed, then its type did not change.

      Also, it didn't mean that the metadata need be unchangable, since it could be changed to reflect greater precision, or if it was wrong in the first place. For example, an html file is a text file (but more). So it is entirely reasonable to change the type from text to html (provided it actually is).

      slashdot.pl and slashdot.txt should NOT collide on my desktop...

      I agree that slashdot.pl and slashdot.txt should not collide, but that is just because they are part of the name. They should also not be required to be a given type.

      How hard is it to change a bash script to a different shell? Change the first line.

      I agree that metadata should be readily accessible. The only reason it is tough on a mac is because it was intended to be difficult, so that new users would have trouble shooting themselves in the foot.

      How would you like it if you had to name all your executable perl scripts ending with .pl? You don't, because the operating system specifies an (optional) header section to every executable file, which allows it to determine which program to run the file with. This is metadata, of the magic number variety. It is data added to the beginning of the file, for the sole purpose of determining its type (ok, in this case it also specifies the path to the perl executable and any flags to be passed it, but ignore that for a moment).

      The reason we have such magic numbers (which are also in most other standard file types, ps, gif, jpeg, etc) is because there are no common operating systems which support file types, so applications are on their own, and are forced to include what is properly (in my opinion) metadata in the file data itself. As long as we are going to store this data, why not have it in a standard location where it can be used by the rest of the operating system?
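
      The #! header described above can be read like any other magic number. A minimal Python sketch, assuming Linux-style handling in which everything after the interpreter path is passed along as a single argument; parse_shebang is a hypothetical helper, not a system call.

```python
def parse_shebang(first_line):
    """Return [interpreter] or [interpreter, argument] from a '#!' line,
    or None if the two magic bytes are missing."""
    if not first_line.startswith(b"#!"):
        return None
    rest = first_line[2:].strip()
    # Assumption: at most one argument follows the interpreter path.
    return [part.decode() for part in rest.split(None, 1)]

perl = parse_shebang(b"#!/usr/bin/perl -w\n")
shell = parse_shebang(b"#! /bin/sh\n")
plain = parse_shebang(b"print 'hello'\n")
```

      Changing a script from one shell to another means editing those first bytes, which is exactly the kind of easily reachable metadata being asked for; the difference is only that it lives inside the file data.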

  • I am not very knowledgeable about this kind of thing, so maybe I am just blowing smoke here, but don't you kind of fall into an infinite loop of metadata after a while? I mean, don't you need to know things like, say, the size of the metadata? Then you have to know the size of the meta-metadata? Then you have to know the size of the metameta-metadata? How do you get around that? (I'm sure there is a simple answer, but I am scratching my head.)

  • by Anonymous Coward
    In a lot of ways this is a pretty good article, but there are a surprising number of instances when the author seems to have bent himself into thinking about a limited number of filesystems. Barring anything academic, experimental, or "fancy," it's pretty clear he's never tried to think about UNIX linked-list style filesystems: within the framework of his discussion, I would assert that a file's name is not part of its essential metadata in a UNIX-style FS. Why? All of the file information is contained in the file's inode and data blocks (the immediate decomposition into metadata and data being obvious). A name for the file is just an entry in a "directory," which is just another file. A given file might be listed in hundreds of different directories, nevertheless, there's still only one file. It might have a different name in each directory it's in; in which case it hardly makes sense to talk about "the filename," unless one is willing to assert that inode + data blocks don't constitute a file, and that each instance of a reference to a particular inode is to be considered a file.

    Furthermore, the examples of "immutable" metadata (ill-considered vocabulary in the first place, I think) are poorly considered. File size can be altered without altering the underlying data on BSD-style unices that provide truncation and extension system calls. Modification time often gets changed on many systems without any change to the underlying data: many, if not most, kernels will change mtime any time a file is opened for write or append even if no subsequent writes are done to the file. "File type" is essentially a nonsense notion on most UNIX filesystems (and DOS too, given the weak representation), a file's type being an interpretation an imaginary multipurpose file handler is to give the data. In such situations, "file type" is decided either by regexp matching of the file name (which can be anything, remember) or by judicious use of magic (man magic if you don't get it). In many cases, this doesn't produce an unambiguous answer: 'file blah' produces 'blah: data' with amazing frequency. Arguably this just means that UNIX filesystems don't have an adequate mechanism to express the idea of "file type," but I would argue that regardless, the notion of file type is at least partially bogus. There's nothing to stop me from interpreting data many different ways: an XPM is something I can edit with an ordinary text editor, and hence a file of type "text," but it can also define pixmaps, so depending on what I want to do with it, it might be of at least two file types. Similarly, I can try to view a raw audio file as a compiled pixmap, or, to recapitulate the famous joke, 'cat /boot/vmlinuz > /dev/audio'. The results of such voluntary file polymorphism aren't always useful, but they sometimes are.

    There are further aspects of the article which are either incorrect, or at least fail to reflect my personal experience, but for the most part it's simply repetition of previous errors. It seems abundantly clear to me that the author is a thoughtful and well-educated person whose primary computing experience has been with Macs and post-DOS MS machines: and while he may have used UNIX-like operating systems, he doesn't know much about data representation of filesystems on them, and clearly hasn't considered more modern developments like filesystems with journaling or ACLs instead of permission bits.

    Perhaps my criticism is a little too sharp, I would like to emphasize that I liked much of the article and I laud the author for thinking about some important concepts in detail, but I feel the viewpoint adopted is one unnecessarily limited by the author's personal experience.
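
    To illustrate the point about names being mere directory entries, here is a short Python sketch. It assumes a POSIX filesystem that supports hard links, and both file names are invented for the example.

```python
import os
import tempfile

def link_demo():
    """Give one file two names; return (same_inode, link_count, data_via_second_name)."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "report.txt")     # hypothetical name
        second = os.path.join(tmp, "other-name")    # a second name, no extension
        with open(first, "w") as f:
            f.write("one file")
        os.link(first, second)  # adds a directory entry; nothing is copied
        a, b = os.stat(first), os.stat(second)
        with open(second) as f:
            data = f.read()
        return a.st_ino == b.st_ino, a.st_nlink, data

same_inode, link_count, data = link_demo()
```

    Both names resolve to the same inode and the same bytes, so asking which one is "the filename" has no good answer.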
    • Barring anything academic, experimental, or "fancy," it's pretty clear he's never tried to think about UNIX linked-list style filesystems

      I assure you, that's not the case :)

      within the framework of his discussion, I would assert that a file's name is not part of its essential metadata in a UNIX-style FS. Why? All of the file information is contained in the file's inode and data blocks (the immediate decomposition into metadata and data being obvious). [...] unless one is willing to assert that inode + data blocks don't constitute a file, and that each instance of a reference to a particular inode is to be considered a file.

      But you can't get at the inode without the file's name and location. Inodes are not suitable as file identifiers since they are not guaranteed to be unique across the multiple disks that make up a given file system. The combination of the file name and location is unique in a given file system. "inode + data blocks" do constitute a file, but the file is inaccessible unless the file name and location are known. Therefore the file name is still essential metadata on a Unix-style file system.

      Furthermore, the examples of "immutable" metadata (ill-considered vocabulary in the first place, I think)...

      I considered "data-dependent", but stuck with immutable, for better or for worse.

      ...are poorly considered. File size can be altered without altering the underlying data on BSD-style unices that provide truncation and extension system calls.

      Truncation is a modification of the data.

      Modification time often gets changed on many systems without any change to the underlying data

      See my previous post on the topic. Yes, the semantics of modification date vary wildly. But there's no reason that the semantics I chose in the example in the fundamentals section (which tries to ignore existing implementations) couldn't exist.

      "File type" is essentially a nonsense notion on most UNIX filesystems

      I agree, which is one of the reasons I didn't address the Unix philosophy of reducing everything to a sequence of bytes or blocks at the OS level.

      the notion of file type is at least partially bogus. There's nothing to stop me from interpreting data many different ways: an XPM is something I can edit with an ordinary text editor, and hence a file of type "text," but it can also define pixmaps, so depending on what I want to do with it, it might be of at least two file types.

      What you want is a type hierarchy that indicates that XPM is of general type "text" and, more specifically, it is an X pixmap. There's nothing "bogus" about the notion of file type. I think you're unnecessarily constraining yourself to very simple metadata values.
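
      As a rough sketch of what such a hierarchy could look like, here is some Python. The type names and parent links are invented for illustration, and letting a type have more than one parent is my own assumption, made so that XPM can count as both text and image.

```python
# Hypothetical type names and parent links, for illustration only.
PARENTS = {
    "xpm": ["text", "image"],
    "html": ["text"],
    "jpeg": ["image"],
    "text": ["data"],
    "image": ["data"],
}

def conforms_to(file_type, wanted):
    """True if file_type is `wanted` or descends from it in the hierarchy."""
    if file_type == wanted:
        return True
    return any(conforms_to(parent, wanted) for parent in PARENTS.get(file_type, []))

xpm_is_text = conforms_to("xpm", "text")
xpm_is_image = conforms_to("xpm", "image")
jpeg_is_text = conforms_to("jpeg", "text")
```

      A text editor would ask whether a file conforms to "text" and an image viewer whether it conforms to "image"; the stored metadata stays a single value while the policy is left to whoever is asking.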

      Similarly, I can try to view a raw audio file as a compiled pixmap, or, to recapitulate the famous joke, 'cat /boot/vmlinuz > /dev/audio'. The results of such voluntary file polymorphism aren't always useful, but they sometimes are.

      Storing file type metadata does not necessarily dictate any OS policies (if any) based on that metadata--something the article tries to point out many times.

      It seems abundantly clear to me that the author is a thoughtful and well-educated person whose primary computing experience has been with Macs and post-DOS MS machines: and while he may have used UNIX-like operating systems, he doesn't know much about data representation of filesystems on them

      I'm not so sure about "well educated." ;-) My primary computing experience is on the Mac and in Unix. I just chose not to address the Unix angle, for various reasons.

      and clearly hasn't considered more modern developments like filesystems with journaling or ACLs instead of permission bits.

      I've certainly "considered" them, and I did mention ACLs (although spelled out instead of by acronym: page 4) in the article. That's all just more, richer metadata.

      • But you can't get at the inode without the file's name and location. Inodes are not suitable as file identifiers since they are not guaranteed to be unique across the multiple disks that make up a given file system. The combination of the file name and location is unique in a given file system. "inode + data blocks" do constitute a file, but the file is inaccessible unless the file name and location are known.

        Actually there have been a number of (frequently ill-considered) non-standard ways to open a file by i-number. Sun's backup co-pilot was the first I had heard of (in '91), but it turns out there were a lot before it, and after. Most allowed only root to do it, but some did not. The ones that didn't broke some of the Unix security semantics.

        Also you can get to a file a few other ways without involving its name, such as recvmsg. Regrettably, something else had to know a name for the file at one point for these to work (that name may be gone now though -- all of the names may be gone in fact).
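
        The "all of the names may be gone" case is easy to reproduce. A minimal Python sketch, assuming POSIX unlink semantics (it will not behave this way on Windows):

```python
import os
import tempfile

def nameless_file_demo():
    """Open a file, remove its only name, and keep using it via the descriptor."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"still here")
        os.unlink(path)                 # the last directory entry is gone
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 64)          # the inode and data blocks live on
        return data, os.path.exists(path)
    finally:
        os.close(fd)

data, name_exists = nameless_file_demo()
```

        The file stays readable until the descriptor is closed, even though no path in the file system leads to it any more.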

      • I considered "data-dependent", but stuck with immutable, for better or for worse.

        Immutable or data-dependent, they're both inaccurate when discussing file types. Unless you can have a definitive and assuredly correct description of what exactly is in that file, you're bound to be wrong on occasion. As well, I'm not entirely convinced that it's important to have a definitive description of a file's contents; frequently a user will open a file in an app that isn't designed to handle it, intentionally. An OS does well to make changing a file's type as easy as possible, something the MacOS has had trouble with in the past. (Downloading a freeware app for something that should be an OS function is hardly convenient, IMO.)
  • I liked this article, it was thought provoking. It reminded me of the Archimedes, with its 16 bit file types, and the Mac. Oh, the dear Mac. How many times did I scream at it, "yes you will bloody open that file!"... While it sits there all like, "No I bloody will not, it's the wrong type, I'm not even looking at it!"...
    Hang on a minute though. It's a bit much to have a go at Microsoft about file extensions. Unix? Written in C? .c and .h files? What would happen if you didn't have extensions?
    Anyway. Personally I get all excited by the idea of accessing files more as a database action. I know there are people that hate the idea.
    Interestingly, NTFS allows you to hang arbitrary stuff off a file. It's also a good way to hide stuff, because almost no-one knows about it. Oh, well.
  • The thing that cheeses me is that the Internet is based on MIME. When I send a file "foobar" to a Windows user, and my email program tags the file as image/jpeg, the Windows email program should make the file name "foobar.jpg".

    A lot of things would be better if the 'lower' OSes would just pay attention to MIME types. But there's one obvious situation where it falls apart.

    Joe Mac User makes an HTML document referencing a bunch of JPEG and Flash images. The JPEG and Flash files don't have extensions in their names. He sends his HTML directory to his Windows-loving friend. Assuming that the Windows or Mac apps paid attention to the file types (either Just In Time on the Mac to add extensions, or the Windows app paid attention to MIME), the user's documents would have appropriate extensions added to them. The Windows user's HTML is busted.

    While it royally bites that I have to put up with extensions in OS X, I can understand why Apple did this.

    Your non-tech-savvy computer users (I'd think that's 80% of computer users out there) are damn clueless, and would be completely unable to fix that HTML example.
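
    Here is the busted-HTML case as a Python sketch. The MIME-to-extension table is a tiny illustrative sample, and name_for_receiver and broken_links are made-up helpers, not anything a real mail client exposes.

```python
# Illustrative mapping only; a real client would consult a fuller registry.
EXTENSION_FOR = {
    "image/jpeg": ".jpg",
    "text/html": ".html",
    "application/x-shockwave-flash": ".swf",
}

def name_for_receiver(name, mime_type):
    """Append the conventional extension unless the name already ends with it."""
    ext = EXTENSION_FOR.get(mime_type, "")
    if ext and not name.lower().endswith(ext):
        return name + ext
    return name

def broken_links(html_links, renamed_files):
    """Links in the HTML that no longer match any file name after renaming."""
    return [link for link in html_links if link not in renamed_files]

sent = {"index": "text/html", "photo": "image/jpeg", "intro": "application/x-shockwave-flash"}
received = {name_for_receiver(name, mime) for name, mime in sent.items()}
busted = broken_links(["photo", "intro"], received)
```

    The files arrive as index.html, photo.jpg and intro.swf, but the HTML still points at photo and intro, so every reference breaks unless something also rewrites the links.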

    • Before /. went kablooey earlier today, someone pointed out that BeOS used MIME for identifying file types.

      As for the MIME example you give above, it is as much the job of Windows to add (or ignore) the extension to a MIME'd file as it is for Apple to add the proper extension. In other words, I'd say it's the Windows machine's fault for not recognizing the file for what it was, just as much as it was the Apple's fault for not adding the extension. Interoperability requires both sides.
  • Linux thoughts (Score:4, Interesting)

    by iabervon ( 1971 ) on Tuesday August 21, 2001 @06:34PM (#2202542) Homepage Journal
    Linux has traditionally not bothered very much with file type. The user generally knows what to do with the file, and does so. What look like extensions are actually just generically part of the filename; there are conventions for them, but they are no more strict than the conventions for filenames in general (Makefile is probably a makefile, README is plain text, foo.c is C source, etc.).

    An important thing to realize is that file type, like, for instance, size, can be determined from looking at the data. In fact, many programs look at data files and determine the file format from the data; "file" does a pretty good job of detecting non-human-readable formats, even without knowing any information at all about the file type.

    Where this all breaks down, of course, is when the user wants to omit the program name. On a Mac, you normally double-click on a data file to open it (and hope to get a program that does what you want). On *nix, you traditionally have to specify the program-- and much of the time, you select a different program depending on the desired result: for foo.c, I could use emacs, or gcc, or I might want gcc -M (get dependencies), or even wc (to see how big it is), not to mention less or grep or etags.

    I think part of the Mac fascination with file type is due to the monolithic program structure; you find the file, and then you open a single program that does to it anything that you will ever do to it. In this model, there is a right program, and which program is right is based on file type. Windows clearly suffers greatly from having this model but not having a more reliable fashion of determining file type than Linux.

    Incidentally, has anyone else noticed that the MacOS scheme is equivalent to having 4 character extensions which aren't displayed, with the corresponding problem of having malicious executables named README.txt (or even README)?
    • Re:Linux thoughts (Score:3, Informative)

      by TWR ( 16835 )
      I think part of the Mac fascination with file type is due to the monolithic program structure; you find the file, and then you open a single program that does to it anything that you will ever do to it. In this model, there is a right program, and which program is right is based on file type. Windows clearly suffers greatly from having this model but not having a more reliable fashion of determining file type than Linux.


      You clearly don't understand the type and creator fields.


      There are TWO separate fields for each file in the classic Mac OS. One (TYPE) indicates what kind of file it is. The other (CREATOR) indicates what program will open the file by default. Each is four bytes long.


      The nice thing about this system is that you get a clean separation between file typing AND default launching application. It's other OSes which have the "monolithic" structure you're talking about.



      Incidentally, has anyone else noticed that the MacOS scheme is equivalent to having 4 character extensions which aren't displayed, with the corresponding problem of having malicious executables named README.txt (or even README)?


      First of all, it'd be an 8 character extension. Secondly, List view on a Mac shows file type by default; an application is listed as "application program". Granted icon view won't discriminate unless you do a get info or sort by kind. Finally, if you don't trust the source of a file, don't open the file. This is common sense, no matter what extensions you are showing or whatever file system you are using.


      -jon

    • Re:Linux thoughts (Score:4, Insightful)

      by DickBreath ( 207180 ) on Tuesday August 21, 2001 @11:20PM (#2203055) Homepage
      You don't understand the difference between TYPE and CREATOR. Imagine the following.

      dickbreath@toybox:~/dudes > ls -la
      total 31337
      -rw------- dickbreath users TEXT NPAD file1.txt
      -rw------- dickbreath users TEXT NPAD file2.txt
      -rw------- dickbreath users TEXT WORD file3.txt
      -rw------- john yum JPEG WORD file4.txt
      -rw------- sean yum JPEG GIMP file5.txt
      dickbreath@toybox:~/dudes >

      There are 5 files. Several of them have been MIS-named! Notice that "ls" has been cleverly modified to indicate the file TYPE and CREATOR metadata.

      file1.txt is a text file. (type TEXT) When you double-click it, it will open in Notepad. (creator NPAD)

      file3.txt is also text. (type TEXT) But when you double-click it, it will open in -- surprise! -- Word!

      file4.txt is not text at all (type JPEG), although the filename might deceive some into thinking it was a text file. But when you've NEVER had to use this stupid ".txt" naming suffix thing, you wouldn't be deceived. In fact, you would wonder why on God's green earth anyone would put ".txt" on the end of a filename. The icon would clearly show it is JPEG, belonging to Word.

      file5.txt is also not text (type JPEG), but surprise, it opens in a *different* application, this time, the GIMP! (Note type is JPEG, creator is GIMP)

      Finally, the icon displayed for a file is determined by the application. Each application has a database of icons to assign. The icon displayed is determined by the unique COMBINATION of type and creator.

      For instance, if GIMP can open JPEG, GIF, and PSD, then you might have a "family" of similarly styled GIMP icons, yet each icon is visually distinct enough to make clear that the file is JPEG, GIF, or PSD. Another app, such as ImageView, might have its own uniquely styled family of similar-looking icons, with "jpeg", "gif", and "psd" variations of those icons.

      When a file is GIF/ImageView, it gets the "gif" icon from the ImageView application. When a file is GIF/GIMP, it gets the "gif" icon from the GIMP application. The icon visually distinguishes what kind of data it is, and what application is going to open it.

      But you can always grab a GIF/ImageView file and drag-drop it onto GIMP. No sweat. In fact, if you then save the document from GIMP, the creator will be changed -- but the type will still be GIF.

      I apologize if I come off as frustrated that such an advanced concept, invented such a long time ago, is still so relatively unknown by so many people who are so technically brilliant. And a lot of it is entrenched thinking. "Well, this is how we've always done it!" We laugh at MS for lack of innovation, yet I hear many here talk about not liking GUIs despite their now finally commonly accepted advantages, and some of us stay stuck in the stone ages when it comes to how unix has always done things.

      Finally, other posters under this topic have complained about how hard it is to change the filetype compared to the filename. Really? They type "mv" to change the name, and "chown" and "chmod", but they can't change the filetype or creator? You have to (in KDE) right click, Properties to change the filename. Would it be so hard in the same dialog to edit the type and creator as well as the filename?

      I bet the same programming genius who could modify "ls" to display the filesystem's type/creator could also write new "chtype" and "chcrtr" commands.
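
A minimal Python sketch of the type/creator idea described in this comment, for anyone who wants to see it spelled out. It is only an in-memory model: `FileMeta`, `APP_ICONS`, and the icon names are hypothetical, and `chtype`/`chcrtr` are just functions named after the commands imagined above, not real utilities.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FileMeta:
    name: str
    type_code: str     # what the data is, e.g. "JPEG"
    creator_code: str  # which application opens it by default, e.g. "GIMP"


# Hypothetical icon database: each application supplies one icon
# per file type it knows how to open.
APP_ICONS = {
    "GIMP": {"JPEG": "gimp-jpeg"},
    "WORD": {"TEXT": "word-text", "JPEG": "word-jpeg"},
    "NPAD": {"TEXT": "notepad-text"},
}


def default_app(meta: FileMeta) -> str:
    """Double-clicking launches the creator, regardless of the file name."""
    return meta.creator_code


def icon_for(meta: FileMeta) -> str:
    """The icon depends on the COMBINATION of creator and type."""
    return APP_ICONS.get(meta.creator_code, {}).get(meta.type_code, "generic-document")


def chtype(meta: FileMeta, new_type: str) -> FileMeta:
    """Change only the type code; name and creator are untouched."""
    return replace(meta, type_code=new_type)


def chcrtr(meta: FileMeta, new_creator: str) -> FileMeta:
    """Change only the creator code; name and type are untouched."""
    return replace(meta, creator_code=new_creator)
```

Re-saving file4.txt from the GIMP would then amount to `chcrtr(FileMeta("file4.txt", "JPEG", "WORD"), "GIMP")`: the name and type stay put, while the default application and the icon change.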
      • name.c and name.h are two text files that may have been created by vi, emacs, gnotepad, KDE's advanced editor, vim, la la la, the list is very long and includes automagic code generators with yet undetermined names. Why bother trying to store this info, when it's so obvious from the name extension? Oh yeah, that's the way we've always done it, so I must be stupid.

        Sometimes I want vim, sometimes I want gnotepad, sometimes I want something else. I never want Word, and I don't want some stupid meta data setter telling me I do. No, thank you, DickBreath.

        • Like I said, I never want Word. It would really upset me to work on a project with some dickbreath who used Word if his modifications made name.c open up that way because of the brilliant "created with" metadata.
        • I never want Word

          Neither do I. But it was the first example to come to mind. It's beside the point.

          Normal apps, i.e. emacs, vi, etc., aren't going to set either the type or creator. So everything still works the same. You still type the same commands, just as always. If the type/creator is not set, extensions could always be used.

          Why bother trying to store this info, when it's so obvious from the name extension? Oh yeah, that's the way we've always done it, so I must be stupid.

          So why should I use this "automobile" thing when the horse and buggy is highly developed?

          And I suppose that Linux users will just always have to put up with a second-rate end user experience, because of what you want.

          It's obvious to anyone that if man were meant to fly, he'd have wings. And it's obvious that the earth is flat.
      • Re:Linux thoughts (Score:3, Interesting)

        by iabervon ( 1971 )
        My point was that, under Linux, "creator" isn't very useful. Most of my files are created by "emacs" (or "cp" or "sed" or something, when I copy a template), but what I normally do with them is compile them. The filename extension only matters a bit (saves having to tell the compiler what language it is explicitly); having a type code would have the same effect, but having the creator wouldn't help at all.

        MacOS and Windows are designed such that you tend to use the same program to deal with a given file, no matter what you're doing with it. If you have a JPEG you made with the GIMP, you'll view it in the GIMP. If you're going to compile a source file, you edit it in the compiler's IDE, and you view it in the IDE. *nix is designed such that you use different programs for different operations (edit, view, compile, render, etc), and use the same program for a given operation for a number of different file types (C, HTML, English text, etc).

        Of course, Windows gets the worst of both worlds-- you have monolithic applications which do everything to a given file, but you don't have the creator metadata, so it picks a program badly.

        I think it would be nice to have file types under Linux; currently, there are a number of partial solutions: emacs has an "Edit this file in -mode" directive, most binary types have magic numbers (e.g., GIF89a, , ^?ELF, JFIF, etc), and some programs look at some extensions. Of course, there would have to be a number of different types associated with a given file (Java source, UTF-8, plain text, etc), and it would have to be simple to specify the type of a new file when you create it, which is currently done partially by naming it in accordance with a convention and partially by putting in data which looks like a certain type, both of which you'd want to do anyway.
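
The magic-number approach mentioned above can be sketched in a few lines of Python. This is an illustration only: the table holds just three of the signatures named in the comment, `sniff_type` is a hypothetical helper, and a real tool such as file(1) uses a far larger database.

```python
from typing import Optional

# (offset, signature bytes, description). The JFIF marker sits at offset 6,
# after the 2-byte JPEG start-of-image marker, the 2-byte APP0 marker,
# and the 2-byte segment length.
MAGIC_SIGNATURES = [
    (0, b"GIF89a", "GIF image"),
    (0, b"\x7fELF", "ELF executable"),
    (6, b"JFIF", "JPEG image (JFIF)"),
]


def sniff_type(data: bytes) -> Optional[str]:
    """Guess a file's type from its leading bytes, ignoring its name entirely."""
    for offset, signature, description in MAGIC_SIGNATURES:
        if data[offset:offset + len(signature)] == signature:
            return description
    return None  # plain text and many other formats carry no magic number
```

Note that a C source file and an English essay both come back as `None` here, which is exactly the gap the comment describes: magic numbers cover most binary types but say nothing about the many kinds of plain text.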
        • One solution is what OS/2 did. It had EAs in the filesystem, and there were various actions one could do on a particular object.

          On a .c file, for instance, you would have a "C Source Code" filetype defined in the EA, and a right click in the GUI would display a number of possible actions, each of which may well invoke a different program.
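
A rough Python sketch of that "one type, many actions" idea. This does not reflect OS/2's actual EA format or Workplace Shell API; `ACTIONS_BY_TYPE`, `menu_for`, and `command_for` are hypothetical names, and the programs are the ones mentioned earlier in this thread.

```python
from typing import Dict, List

# Hypothetical table: a file type (as it might be stored in an extended
# attribute) maps to a menu of actions, each backed by a different program.
ACTIONS_BY_TYPE: Dict[str, Dict[str, str]] = {
    "C Source Code": {
        "Edit": "emacs",
        "Compile": "gcc",
        "List dependencies": "gcc -M",
        "Count lines": "wc",
    },
}


def menu_for(file_type: str) -> List[str]:
    """Actions to show on right-click for a file of this type."""
    return list(ACTIONS_BY_TYPE.get(file_type, {}))


def command_for(file_type: str, action: str, path: str) -> List[str]:
    """Build the argument vector that would run the chosen action on the file."""
    program = ACTIONS_BY_TYPE[file_type][action]
    return program.split() + [path]
```

The point of the design is that the type picks the menu, and the user picks the operation, so no single "right program" has to be guessed.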

  • If I understood the article, file types are bad because they get in the way of allowing the user to determine how to open and view files. The only real reason to want file types is closely related to application binding, IMHO - some users want *all* html files to open in Frontpage, others want to pick and choose on a per-file basis, most want something in between.

    But then why even *have* file types? You can survive quite nicely without them if you do have application binding metadata. Whenever you use an app to create a file, that file should be bound to that app. If you want to subsequently open that file in a different app, then you should let the app try. It's up to the APP, not the filesystem, whether it can open it or not. Why shouldn't you be able to open a JPG in Notepad? If Notepad has a hex viewing capability, it should open just fine.

    a well-designed app should let the user attempt to open any file. It should try and interpret the data correctly, and it should allow the user to bind the file to the app if they so choose.

    IMHO the whole notion of file types is a mistake - the Mac approach seems to be to incorporate type as metadata; the Windows approach seems to be to use an extension. But neither is really necessary.

    as a final note - dumping file types avoids the "identical icon" problem that the author demonstrated in the screenshot. Simply use the icon for the file that corresponds to the *binding*, not the file type.
    • a well-designed app should let the user attempt to open any file. It should try and interpret the data correctly

      Isn't this the whole point though? Data is (are?) just data. It doesn't mean anything unless you know how to interpret it. It's like DNA. DNA is just a bunch of data. It doesn't contain anything saying "I'm DNA" or "Read me like this". That information is external to the data itself (themselves? :)
      Even on top of that, once you get to the file data, often there are multiple subtypes within it. For example, RIFF files are composed of chunks, each containing different types of data. As long as we consider files to be monolithic, opaque blobs, we're restricting ourselves.
      XML is... A discussion for another day...

      • > It doesn't contain anything
        > saying "I'm DNA" or "Read me like this"

        sure it does - the "control codes" for DNA are embedded in between the genes. There are genes that contain "data" and there are "start", "stop" , and other more complex signaling all built in. DNA is a *bad* model for filesystems because the data and metadata are all in one long stream, but it works because it's a massively parallel system, and carefully and precisely regulated by enzymes (analogous to environment variables)
        • No, the signals you talk about are simply strings of bits, fundamentally. Yes, a certain combination of three consecutive bases means "stop". But that idea is not itself embedded in the DNA. It's held in the DNA reader. That was my whole point. DNA does not contain metadata, it is just data. A "stop" code is just as much data as a "make this amino acid" code. Metadata is by definition something about the data. You could put it in the same "stream" but you don't have to, and it often doesn't make sense to do so.

          • not true - there are also long strings which do not code for protein but are "attachment" points where the transcription enzymes know to latch on. DNA is processed by external readers, and those readers look for codes embedded in the DNA bitstream to decide when and where to attach. Once they have attached and started processing, THEN start-stop becomes relevant. There may be several start-stop regions in one long patch. But HOW to process the data is embedded in the data.

            i recommend Stryer for a good biochem text...
    • While AmigaOS used .info files to keep track of things like applications, I did like the way IFF was used as _the_ file format. Sounds, images, they were all stored in IFF files that kept track of what exactly the file held. Sort of like a bundle.

      Then there was Data Types. The theory was that if an App knew about data types, it didn't need to know how to write a particular format as long as Data Types did know. I liked this idea. If a new format came around, I didn't need to update all my apps (as long as they knew how to use Data Types).

      Xix.
  • From the article:

    Any part of the Mac OS user experience that exactly duplicates the experience on another platform ceases to be a compelling reason to buy a Mac.

    I totally disagree. I had absolutely no interest in Macs until OS X, and the reason I switched was because it acts just like a *nix. I can pull up bash, run emacs, grep, sed, awk, etc. Duplicating the unix experience was a very compelling reason for me to buy a Mac. Naturally, little things like Quicktime, games, and DVD support sweetened the deal. :)

    As far as metadata is concerned, I think that Mr. Siracusa is right. The current unix way of handling metadata sucks. Unfortunately, the future does not lie with the old "Mac Way", which is arguably a good deal more elegant. Steve Jobs knows this, which is why his new OS is based on unix, despite its occasional warts (like file extensions). Apple has done what it had to do to survive in this new world. I just hope that a lot of the old Mac partisans will stop trying to cling to the past and join us for the ride.

    • From the article: Any part of the Mac OS user experience that exactly duplicates the experience on another platform ceases to be a compelling reason to buy a Mac. I totally disagree. I had absolutely no interest in Macs until OS X, and the reason I switched was because it acts just like a *nix.

      The fact that you can run Unix apps may have removed a reason for you to avoid Mac OS, but it is not a compelling reason to switch in and of itself. If Mac OS X acts "just like Unix", why would you switch to it from Unix? Obviously there was some other compelling reason to switch--something that differentiates it from other OSes that are also Unix or Unix-like. Those differences are what make people switch. Features that are the same merely remove those features from the decision-making process.

      P.S.-If you read any of the reader mail from my OS X reviews [mindspring.com], you'd know that I'm really a PC bigot ;-)

      • Good point. However, my infatuation with OS X stems not from any single thing I can point to and say "that's a Mac thing". The compelling reasons to switch were because it:

        • Gave me all my Unix tools. (Unix)
        • Gave me Quicktime, Photoshop, and a bunch of games. (Windows)
        • Gave me DVD support. (Windows)
        • Gave me two mouse buttons and a scroll-wheel. (Just about everything but the MacOS)
        • Gave me a command prompt. (Again, anything but the MacOS)
        In short, it allowed me to take the Unix plunge without losing all of the applications that I previously had in Windows.

        Naturally, after I got into it there were other things that I liked about OS X. I love the Quartz display layer and all of its PDF goodness. I love Cocoa. I love the elegance of the dock. However, there are very few "Mac" things that I can point to and say, "I like that. Don't throw that out."

        Anyway, calling you a bigot was childish and I'd like to apologize for that. Even if I disagree on a few points, it was an excellent article and I encourage you (and the rest of the Ars crew) to keep up the good work.

  • The author mentions that in CoreServices, two different Finders appear.
    Checking my /System/Library/CoreServices with terminal.app, I can see that one is simply called "Finder", the other is called "Finder.app". Changing my Finder view to "table" I can see that one is an "Application"; the other, a "Classic Application." So there are ways to differentiate the files-- though neither is quite elegant. The extensions are probably necessary for Nextstep compatibility.

    In Windows 95 and its successors, the GUI hides the extensions, and as the author points out, this can cause serious problems with vbs viruses. But what was left unmentioned is that it also is hard on programmers. If you can't tell the difference at a glance between "myclass.h" and "myclass.cpp", it really cramps your coding style...

    Microsoft also hides files that end in ".dll"-- which is a pain if you program libraries. This is somewhat more defensible, but not by much.

    Truth be told, although certain aspects of the Type/Creator code were far more elegant than anything Windows 9X ever developed (Note to Adobe-- grabbing the .ps extension for Distiller is just plain rude), the immutability of the Creator/Type codes, save for ResEdit, is somewhat inconvenient. I remember writing AppleScript applications to change these codes en masse. Not exactly user friendly.
    • For the sake of your sanity, it's a good idea to make sure Windows Explorer is configured properly before using a Windows account.

      The folder view options are accessible either by selecting 'Options' from the 'View' menu or by selecting 'Folder Options' from the 'Tools' menu, depending on version. In the 'Advanced Options' section of this dialog, you'll probably want to tell Explorer to:

      • Display the full path in the address and title bar.
      • Show hidden files and folders.
      • Not hide file extensions for known file types.
      • Not hide protected operating system files.

      The exact names of these options vary between versions; I'm reading these off Windows 2000.

      • I've made all those changes-- but in Windows, it seems that the choice lies between an interface that is mildly useful for programming, and an ugly/cumbersome one. The Ars Technica article suggests that this is a false tradeoff.
  • These did the trick.

    On HPFS, they were stored as part of the file in the filesystem. You could copy the file to a FAT formatted floppy, however, and the EA's were stored as a separate file, allowing you to keep all attributes, including the long file name.
  • I disagree with the idea that file extensions are a hack. I think the nature of Linux lends itself well to the idea that the file type should be encoded into the name of the file in a human-readable form.


    What I do have a problem with is the splintered way that MIME is done in practice. Suppose I want my file type "foo" to be associated with a certain mime type and opened with my fooviewer, I would have to register my application/x-foo in:


    /usr/share/mimelnk/application/x-foo.kdelnk
    /usr/share/applnk/Multimedia/fooviewer.kdelnk
    /usr/share/mime-info/fooviewer.keys
    /usr/share/mime-info/fooviewer.mime
    /etc/mime.types
    /etc/mailcap
    /usr/local/lib/netscape/mime.types
    /usr/local/lib/netscape/mailcap


    I can't even figure out what the heck Mozilla uses for local MIME types... It apparently isn't any of these, in the version of Mozilla I have. I see it makes some nice XML files for user defined types, but those don't work with plugins.


    Why can't we just standardize on using /etc/mime.types and /etc/mailcap? I mean come on!
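
For what it's worth, resolving a viewer from just those two files is not much code. The Python sketch below assumes the common layouts ("type/subtype ext ext..." per line in mime.types, "type/subtype; command %s" per line in mailcap) and deliberately ignores mailcap wildcards, escaped semicolons, and extra fields. The function names and the fooviewer entry are hypothetical.

```python
from typing import Dict, Optional


def parse_mime_types(text: str) -> Dict[str, str]:
    """Map each extension to its MIME type, mime.types style."""
    ext_to_type: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        mime_type, *extensions = line.split()
        for ext in extensions:
            ext_to_type[ext.lower()] = mime_type
    return ext_to_type


def parse_mailcap(text: str) -> Dict[str, str]:
    """Map each MIME type to its view-command template (first entry wins)."""
    type_to_command: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = [field.strip() for field in line.split(";")]
        if len(fields) >= 2:
            type_to_command.setdefault(fields[0], fields[1])
    return type_to_command


def viewer_command(filename: str,
                   ext_to_type: Dict[str, str],
                   type_to_command: Dict[str, str]) -> Optional[str]:
    """Return the command line that would open the file, or None if unknown."""
    _, dot, ext = filename.rpartition(".")
    if not dot:
        return None  # no extension, nothing to look up
    mime_type = ext_to_type.get(ext.lower())
    template = type_to_command.get(mime_type) if mime_type else None
    if template is None:
        return None
    return template.replace("%s", filename)
```

With "application/x-foo foo" in the first file and "application/x-foo; fooviewer %s" in the second, `viewer_command("demo.foo", ...)` yields "fooviewer demo.foo", which is the single registration the comment is asking for.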

    • Mozilla's file type database is a bit broken at the moment, but should end up using mime.types and mailcap under Unix, as earlier versions did. I can't find the Bugzilla number for this at the moment, but it's in there somewhere.

  • Gee, want to have everything MacOS has without modifying the underlying OS to support resource forks?

    1.) make sure apps hide file extensions, preferring, instead, icons
    2.) Hell, UNIX people use extensions like .tar.bz2 to signify bzip2'd tar files, right? Get ready for 8bim.tif. (for anyone curious, 8bim is the creator code for Photoshop docs on a Mac.)

    Really, there's the wonderful, superior data you get on a Mac when dealing with a Photoshop TIFF. "8bim" as the creator code (huh?) and the more sensible "tiff".

    Sure, sounds great. *rolls eyes*

    Sure, feel free to rip me a new one if I didn't use the proper terminology. I mess with ResEdit maybe once a year. :-P

    • Hell, UNIX people use extensions like .tar.bz2 to signify bzip2'd tar files, right? Get ready for 8bim.tif. (for anyone curious, 8bim is the creator code for Photoshop docs on a Mac.)
      These two situations do not compare at all. Multiple extensions like *.tar.bz2 usually mean that within one file is another: what I extract from the *.bz2 file is a *.tar. Neither BZ2 nor TAR is an ownership marker; they're both types.

      This can happen on the Mac a lot; I have many *.sit.hqx files lying around. Those are BinHex'd (a type of encoding) StuffIt archive files.
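
A small Python sketch of that "each extension is a wrapping layer" reading. `LAYER_NAMES` and `layers` are hypothetical, and the table covers only the four extensions discussed here.

```python
from pathlib import PurePosixPath
from typing import List

# Hypothetical table: every entry names a TYPE of wrapping, never an owner.
LAYER_NAMES = {
    ".bz2": "bzip2 compression",
    ".tar": "tar archive",
    ".hqx": "BinHex encoding",
    ".sit": "StuffIt archive",
}


def layers(filename: str) -> List[str]:
    """List the wrapping layers implied by the name, outermost first."""
    suffixes = PurePosixPath(filename).suffixes  # e.g. ['.tar', '.bz2']
    return [LAYER_NAMES.get(suffix.lower(), "unknown") for suffix in reversed(suffixes)]
```

Under this reading a name like 8bim.tif has a single layer (.tif), and the "8bim" part is just the file's name, which is why a creator code does not fit the stacked-extension convention.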
  • As usual John Siracusa brings up excellent points. However there are a few places that perhaps he's glossed over or disagrees with that I feel could be important:
    1. John argues that the OS can handle flattening files and creating file extensions when they are written to transports & filesystems that don't support the MacOS metadata properly.

      This relies on the MacOS always having appropriate mappings between filetype/creator codes and those annoying DOS extensions - not something that is always possible. Furthermore, in an increasingly networked future it's not always assured that files will pass directly in & out through the OS; they will likely just as often come & go through alternate transports, all of which would have to be rewritten to support this. As this enforced-extension functionality is already standard in many applications, it seems reasonable to simply codify it there rather than rewrite everything else, particularly as the creating application will have far more insight into the appropriate extension than the OS could.

    2. John argues that the user should always have control over a file's naming and not the OS, yet acknowledges that renaming-with-extensions will often be required in a networked multi-OS environment.

      Personally I would always prefer any extension-addition be made and clearly communicated when I explicitly create a file and not later when it passes in and out of MacOS-metadata-supporting networks and filesystems. Just as John is appalled at the proposal for hiding these extensions from the user's view, I'd be appalled at their being automagically added to my files' names at some later date when they may get moved around or viewed from another OS. At least when I name a file "whiz" and the application insists on creating it as "whiz.bang" I know about it; I don't find out later that my "whiz" is that on some servers and "whiz.bang" on others, or that it's "whiz" for the other Mac users and "whiz.bang" to the *nix & Wintel folks.

    3. Finally John views the possibility of Apple moving from its MacOS X HFS+ native filesystem to some other with alarm; I see this as evolution.

      HFS+ is a fine filesystem but it's unique in an increasingly unnecessary way. Other more modern filesystems are being created and if MacOS X is to remain current it needs to keep up and take advantage of these advances. Journaling filesystems are poised to become a standard feature of modern *nix implementations - should Apple lock themselves out of this? Furthermore it's not obvious that new filesystems will necessarily obviate the MacOS-metadata (ReiserFS [ibm.com] seems particularly well poised to eventually incorporate much of this) but preparing for all eventualities seems wise.

    Apple no longer lives in its own comfortable bubble. It's now a peer OS in a world increasingly sophisticated and fast-moving. Having grafted MacOS's strengths onto the Next operating system, Apple has now entered the rejuvenated unix environment and needs to compete not only on its own terms but also on those of the other modern operating systems.

    While it remains important to retain those strengths that have made MacOS such a survivor, it's also necessary to not hobble it with dependencies on unnecessary Apple-only limitations. Flexibility is the order of the day and this includes some reasonable level of filesystem versatility. Apple already supports a variety of filesystems; now it's time to allow for the possibility of multiple "native" ones while retaining much of its vaunted metadata strengths.

  • After using the Mac for 10 years (and the PC alongside... and a lot of Linux as well), and after reading the mentioned article, I must say the Mac way is the best way -- in a closed model. For years, it's been so convenient: no file extensions, nothing to worry about. Heck, even if I add ".txt" to a Photoshop file, it'll still open with Photoshop correctly.

    Add to that Windows.
    Now the Mac has to be aware of Windows files. That's handled by a Mac "control panel" called File Exchange. If there's a file without type/creator metadata (which the Mac depends on, in part), File Exchange says, "Hey, that's a Windows file that ends in .psd. That's really a Photoshop file!" So, it opens it with Photoshop. Alright sir, 'nuff said.

    Add to that networking/internet.

    Now the Mac not only has to worry about file extensions, but also its forks (data fork / resource fork). So if I send a program over email -- even to another Mac -- the result will be garbled data that won't work. I have to first convert the Mac file to MacBinary -- which squooshes together the forks. On the other side, it can be uncompressed into a two-fork program again and it works perfectly.
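
To make the "squoosh the forks together" step concrete, here is a toy Python sketch. It is NOT the MacBinary format (which has its own header layout carrying the name, type, creator, and more); the 8-byte length header and the `flatten`/`unflatten` names are assumptions made purely for illustration.

```python
import struct
from typing import Tuple

HEADER = struct.Struct(">II")  # two big-endian lengths: data fork, resource fork


def flatten(data_fork: bytes, resource_fork: bytes) -> bytes:
    """Pack both forks into one byte stream that a single-fork transport can carry."""
    return HEADER.pack(len(data_fork), len(resource_fork)) + data_fork + resource_fork


def unflatten(blob: bytes) -> Tuple[bytes, bytes]:
    """Split a flattened stream back into (data fork, resource fork)."""
    data_len, resource_len = HEADER.unpack(blob[:HEADER.size])
    data_start = HEADER.size
    resource_start = data_start + data_len
    return (blob[data_start:resource_start],
            blob[resource_start:resource_start + resource_len])
```

Sending only the data fork over email is what produces the garbled result described above; the round trip through a container like this is what keeps the resource fork attached.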

    Eh, sorta annoying, but I compress things I email anyway because I hate emailing huge files.

    Mac OS 9 gives me no problems. Files work right over PC networks, etc. Mac OS X works even better over networks -- in fact, in my work it is much smarter and works more efficiently than Windows NT (or even Linux).

    The problem? Now Mac users have to worry about file endings --- in a sense. Applications use the Bundle methodology. Works great!

    Files, on the other hand, *sometimes* need extensions.

    When don't they?
    If the file is opening in a "Classic" application (meaning it is being run through the old mac os 9 codebase... though it's not really "emulated"). Because those "old" files HAVE type/creator codes the Mac understands.

    When DO they?
    If it's purely a Mac OS X file, for the most part. Now in 10.1, the file endings can be hidden. But that doesn't solve the real problem: the Mac is battling PC/Unix files from the net AND its original OS 9-and-lower files that now have to carry redundant metadata.

    Apple really needs to solve this. I know a lot of the OS X programmers, and they're extremely committed and bright, so I'm sure the problem will be fixed. Most importantly: make the Mac work great over the 'net (which it really does so far), AND make the experience very easy. I HATE file extensions. I love the old type/creator method. But I'm sure it could be done even better to satisfy all.
  • A few articles at The Register [theregister.co.uk] have talked about how MS will be "building SQL Server into the OS - effectively making the file system a relational database." This will greatly improve efficiency, and concentrate them into one database format instead of a mixture. And it's even probably legal.

    • A few articles at The Register [theregister.co.uk] have talked about how MS will be "building SQL Server into the OS - effectively making the file system a relational database." This will greatly improve efficiency, and concentrate them into one database format instead of a mixture. And it's even probably legal.


      Words fail me here. There are many beautiful reasons why a relational model makes perfect sense for a file system, and, for that matter, why the notion of "file" itself might be changed for the better in such a system. And, if there is any advantage to monopoly, it is that a tyrant with good ideas can actually make them work.


      The problem is, of course, that the good idea is the notion of a relational file system. SQL is, was, and forever shall be a disgusting hack. Everybody who does research in databases knows this, and almost everybody who works with SQL on non-trivial things knows this, too. And MS knows this as well. Indeed, they could, if they wanted, implement the relational model in a nice, clean way and support it with a query language that's less Cobol-esque and they could make everybody use it.
      The problem is, though, that this doesn't maximize market share by ripping apart its competitors like kleenex pinatas. To do that, you have to first embrace, then extend, then spend a decade to clean up the mess you made in the first place. In other words, SQL Server in the OS is the quintessential Microsoft move. But, hey, at least it does solve the file associations problem in a slightly less dorky way...

  • Just wanted to point out that under the Mac system, the fact that you can click on two different files of the same type and end up in different applications can actually be *tremendously* confusing to a naive user. This capability isn't necessarily something to be proud of.

    Similarly, the windows system, with file extension associations that are essentially a total mystery to the average user is also tremendously confusing. You can install some lame-ass scanner software and have it decide that it owns all the image file types you used to have associated with photoshop. Now, how do you get back to normal?

    The point that I'm making is that doing almost anything "automagically" has the potential of being a source of confusion. UI designers need to think a little bit more about empowering the user rather than just concealing things from them. Obscurity != ease of use.

    (I strongly suspect that hiding file extensions by default was a really bad idea.)

  • Why does the file name have to be there?

    Why are we still using file systems?

    Computers are quite capable of managing huge organized trees of data. Why are we still fighting with bitstreams like this?
  • One thing that bugs me: when a program (e.g. Nautilus) builds a thumbnail for an image file, the thumbnail isn't attached to the file, it is stashed somewhere (e.g. in a hidden directory called ".thumbnails" or something like that). This is a hack.

    It's worse when you have multiple programs that want thumbnails; there isn't a standard yet and you get multiple thumbnails.

    What I really want is some metadata attached to the image file, and the thumbnail in there. Then when you copy or move the file, the thumbnail goes along. And of course we need a standard so all the programs that want thumbnails will all do it the same way.

    steveha
  • Actually, this isn't about "metadata", it's about data attributes. Real metadata describes data, rather than just identifying it. An SQL schema is metadata. An XML DTD is metadata. A file type is an attribute.

    The main issues regarding file attributes are "how do you find it", and "what do you do with it when you've found it". UNIX users are used to thinking of these as being tied strongly to file names, but that's not fundamental. Consider Microsoft's Fast Find and Active Directory, for example. Or the way the MacOS handled applications; it didn't matter where they were, because there was a database (the "desktop") that automatically tracked them.

    Much of the complexity associated with UNIX programs involves finding their various parts. This usually involves some combination of path variables, configuration files, command line options, shell scripts, and fragile directory tree structures. Something better is needed.

    A separate issue is whether files should be flat streams of bytes or should have structure. The Mac had both; the "data fork" of a file was a byte stream, and the "resource fork" was a tree of records, rather like the Windows registry. This was a good idea, although it suffered from the fact that the machinery for updating the resource fork was prone to corrupting it, which discouraged its use for dynamic data storage. (Or, as Apple put it, "the Resource Manager is not a database.")

    Many of Apple's better ideas suffered from what Mac developers called the Mess Inside. One major effect of this was that things that required keeping complex data structures consistent didn't work too well. Application bugs could corrupt the desktop or resource forks, and the system-level machinery which processed those data structures didn't check them. ("It's more fun to be pirates" - Steve Jobs.) So data tended to turn to mush, which gave resource forks a bad name. Today, most programs store all the important stuff in flat files.

  • The current problem is that filesystems don't make it easy to store properties of a file. The HFS made a brave attempt by dividing files into content and properties (using data and resource forks), but it still didn't objectify the filesystem. For example, creating children of a particular file necessitated converting the file into a folder and then wondering what the hell you were going to do with the folder properties - were they going to be placed within the folder or within the parent folder.

    A solution would be an object-oriented filesystem, that allows every file to have children without nasty conversions, and implements a simple store for properties (a Berkeley DB file would seem a natural solution).
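
A minimal Python sketch of that kind of node, where every file carries content, a property store, and children at the same time. The `Node` class and its methods are hypothetical, and a plain in-memory dict stands in for the Berkeley DB store suggested above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Node:
    name: str
    content: bytes = b""                                         # the byte stream
    properties: Dict[str, str] = field(default_factory=dict)     # simple property store
    children: Dict[str, "Node"] = field(default_factory=dict)    # any node may have children

    def add_child(self, child: "Node") -> "Node":
        """Attach a child without converting this node into a folder."""
        self.children[child.name] = child
        return child

    def lookup(self, path: str) -> Optional["Node"]:
        """Resolve a slash-separated path relative to this node."""
        node: Optional["Node"] = self
        for part in filter(None, path.split("/")):
            node = node.children.get(part)
            if node is None:
                return None
        return node
```

For example, an image node could keep its own bytes and a "type" property while holding a "thumbnail" child, so copying or moving the node would carry the thumbnail along with it.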
  • So would data about Metadata be... "Gitadata"?

    I know, but I just had to.
