MS Office XML Format Now In TextEdit 86
computerdude33 writes "Apparently, Apple heard of Microsoft Office changing to XML formats. If you have OS X 10.4.2, you can save documents in TextEdit in Word XML Format. They are saved with a *.xml extension, and are riddled with references to Word. Here is an example of one of these documents."
Beating MS... at their own game. (Score:4, Funny)
Re:Beating MS... at their own game. (Score:1)
Re:Beating MS... at their own game. (Score:2, Informative)
Re:Beating MS... at their own game. (Score:1)
And you'd better look for a DRM-cracking thing for Office's new fancy tech [slashdot.org].
Re:Beating MS... at their own game. (Score:2)
OO.Org (Score:1)
Re:OO.Org (Score:4, Informative)
http://www.openoffice.org/issues/show_bug.cgi?id=
in case you're curious... (Score:4, Interesting)
?xml version="1.0" encoding="UTF-8" standalone="yes"?
?mso-application progid="Word.Document"?
w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word
Re:in case you're curious... (Score:1, Troll)
Re:terrible moderation (Score:2, Interesting)
Re:in case you're curious... (Score:1)
Personally, I'll take
Re:in case you're curious... (Score:4, Interesting)
As is evidenced by the lovely pause that happens whenever I close an MSN Messenger window of someone I chat to often, and it appends the chat history to the 1.5 MB XML file by reading and writing the whole XML file again....wugga wugga wugga.
(Either that, or their append code sucks!)
But other than that, yes. The size argument doesn't stand up - a counter-intuitive result, but seems to be true. Especially when you start zipping XML files.
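For anyone who wants to check the zipping claim themselves, here is a rough sketch in Python. The record layout is made up for illustration (it is not the real Messenger history schema), and a repeated record compresses far better than a real chat log would, so treat the ratio as an upper bound on how well this can go.

```python
import zlib

# Hypothetical chat-history record; the element and attribute names are
# invented for this sketch and are not the actual Messenger format.
record = (
    '<Message DateTime="2005-08-01T12:00:00Z">'
    '<From><User FriendlyName="alice"/></From>'
    "<Text>hello there</Text>"
    "</Message>\n"
)

# Build a verbose XML log by repeating the record many times.
xml_log = ("<Log>\n" + record * 1000 + "</Log>\n").encode("utf-8")

# Compress at the highest zlib level and compare sizes.
compressed = zlib.compress(xml_log, 9)
ratio = len(compressed) / len(xml_log)

print(f"raw: {len(xml_log)} bytes, compressed: {len(compressed)} bytes, "
      f"ratio: {ratio:.3f}")
```

The repeated tag names are exactly the kind of redundancy a dictionary-based compressor removes, which is why the "XML is huge" argument gets weaker once the file is zipped.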
Re:in case you're curious... (Score:3, Informative)
Multiple indexes can be included, and the last one found is used.
This means that you can actually save, and update a PDF file, by just appending to the end. You can even save the file on a WORM device that allows multiple sessions.
Doing this also maintains a full file history too. You can retrieve any version of the file
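To make the append-only idea concrete, here is a toy sketch in Python. It is NOT the PDF format: the `%%INDEX` marker, the JSON index, and all function names are invented for illustration. It only shows the mechanism described above: new data and a new index are appended, the last index wins, and older indexes stay reachable.

```python
import json

# Invented marker for this toy format (real PDF uses xref tables and trailers).
INDEX_MARK = b"\n%%INDEX "


def read_index(buf: bytearray, pos: int) -> dict:
    """Parse the JSON index whose marker starts at byte offset pos."""
    start = pos + len(INDEX_MARK)
    end = buf.index(b"\n", start)
    return json.loads(bytes(buf[start:end]))


def append_revision(buf: bytearray, objects: dict) -> None:
    """Append object bodies and a fresh index; earlier bytes are never touched."""
    prev = buf.rfind(INDEX_MARK)  # -1 when this is the first revision
    offsets = dict(read_index(buf, prev)["offsets"]) if prev != -1 else {}
    for name, body in objects.items():
        offsets[name] = [len(buf), len(body)]
        buf += body
    buf += INDEX_MARK + json.dumps({"offsets": offsets, "prev": prev}).encode() + b"\n"


def read_object(buf: bytearray, name: str, revisions_back: int = 0) -> bytes:
    """Read an object via the last index, or walk back to an older revision."""
    pos = buf.rfind(INDEX_MARK)
    if pos == -1:
        raise ValueError("no index found")
    index = read_index(buf, pos)
    for _ in range(revisions_back):
        index = read_index(buf, index["prev"])
    offset, length = index["offsets"][name]
    return bytes(buf[offset:offset + length])


doc = bytearray()
append_revision(doc, {"page1": b"Hello"})
first_revision = bytes(doc)
append_revision(doc, {"page1": b"Hello, world"})

print(read_object(doc, "page1"))      # current version
print(read_object(doc, "page1", 1))   # previous version
```

Because the second save only adds bytes after the first, the same scheme works on write-once media, and every earlier version stays recoverable by following the `prev` links.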
Re:in case you're curious... (Score:5, Insightful)
What is your point? Oh lord, this file is 1200 bytes long, for "just two words of text."
I created the same two-word document and saved it in several text-based formats that preserve the formatting. HTML (2700 bytes), RTF (3600 bytes), PDF (16,600 bytes), and of course, Word
The XML version is smaller than all three, and I daresay, easier to parse and manipulate with a third-party program.
Yeah, if you don't want any formatting information stored with your text, use plain text. But otherwise, XML seems to be as good a format as any of the other markup doc formats commonly used in Office.
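On the "easier to parse" point, a minimal sketch with Python's standard library. The sample document is a hand-written stand-in, not real TextEdit output, and the namespace URI is the one quoted elsewhere in this thread, so treat both as assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace URI as quoted in this thread; assumed, not checked against a spec.
W = "http://schemas.microsoft.com/office/word/2003/2/wordml"

# Hand-written stand-in for a two-word document (not actual TextEdit output).
SAMPLE = f"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?mso-application progid="Word.Document"?>
<w:wordDocument xmlns:w="{W}">
  <w:body>
    <w:p><w:r><w:t>Hot</w:t></w:r><w:r><w:t> time!</w:t></w:r></w:p>
  </w:body>
</w:wordDocument>"""


def paragraphs(xml_bytes: bytes) -> list:
    """Return the plain text of each w:p paragraph, joining its w:t text runs."""
    root = ET.fromstring(xml_bytes)
    ns = {"w": W}
    return [
        "".join(t.text or "" for t in p.iterfind(".//w:t", ns))
        for p in root.iterfind(".//w:p", ns)
    ]


print(paragraphs(SAMPLE.encode("utf-8")))
```

Pulling the text out takes a dozen lines and no knowledge of the formatting elements, which is the practical advantage over the binary .doc format.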
Re:in case you're curious... (Score:3, Interesting)
Well, sir, you made the point nicely. Although the HTML file that I came up with in vi came in at around 48 bytes. The 33 tags that TextEdit produces for doc-like XML are actually a pretty compact way of describing a document along with formatting.
Here's my $.02 on the bigger picture here: instead of fighting about document formats with Microsoft, we will now be fighting over XML data structures. Same old bully, just a different playground.
Re:in case you're curious... (Score:1)
Re:in case you're curious... (Score:1)
Re:in case you're curious... (Score:2)
Re:in case you're curious... (Score:1)
Open Office (I think) supports the Oasis format for word processor documents. However, if you're talking about MSWord, the only standard is the one Microsoft defines.
Re:in case you're curious... (Score:2)
Re:in case you're curious... (Score:1, Interesting)
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
Re:in case you're curious... (Score:3, Insightful)
Re:in case you're curious... (Score:3, Insightful)
It's also a fair example, because Word-HTML can "round-trip" back to Word with no loss in fidelity. A barebones HTML file cannot.
Re:in case you're curious... (Score:3, Informative)
Well, that's often the case, but I'm betting you could encapsulate two words in a way that could be transported back to Word (with formatting intact) a lot more efficiently.
A lot of the bulk seems to be Word saving unused style sheets, which arguably doesn't need to be done to keep the document true.
Re:in case you're curious... (Score:2)
Re:in case you're curious... (Score:5, Funny)
Re:in case you're curious... (Score:2)
Who the fuck is Pete?
Re:in case you're curious... (Score:1)
\documentclass[11pt]{article}
\begin{document}
Hot time!
\end{document}
Try it, you'll like it!
Why remove the greater and less than signs? (Score:2)
<?mso-application progid="Word.Document"?>
<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/2/wordml" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:SL="http://schemas.microsoft.com/schemaLibrary/2003/2/core" xmlns:aml="http://schemas.microsoft.com/aml/2001/core" xmlns:wx="http://schemas.microsoft.com/office/word
Who is maintaining the "standard"? (Score:2)
I think of the browser wars. MS loves it that everyone but them is W3C-compliant, because that ensures they can break all other browsers simply by being incompatible with one standard. Because of their market share, developers will just 'give up' and code CSS, JavaScript, and the like as IE-compatible. Out of frustration with incompatible websites, u
Re:Who is maintaining the "standard"? (Score:4, Informative)
Re:Who is maintaining the "standard"? (Score:3, Informative)
Re:Who is maintaining the "standard"? (Score:1)
Re:Who is maintaining the "standard"? (Score:1)
Re:Who is maintaining the "standard"? (Score:2)
Re:Who is maintaining the "standard"? (Score:3, Insightful)
Nor was it true that "nobody cared". Lots of people bitched about it.
Re:Who is maintaining the "standard"? (Score:2)
While IE6's poor standards support is a limitation now, it is nothing compared to the pains that Netscape 4 put people through.
Re:Who is maintaining the "standard"? (Score:3, Insightful)
MS didn't achieve browser dominance just through (mis)use of their monopoly. Netscape helped them by releasing NN4.
Re:Who is maintaining the "standard"? (Score:2)
There were certain tags and technologies that (arguably) needed to be made or developed that Netscape had to do
Element types. Not "tags".
but there were also W3C standards that Netscape blatantly ignored. For example the CSS standard was made prior to Netscape 4, but Netscape had notoriously poor support for it, while IE had CSS support (albeit very limited) back in version 3.
Get your facts straight. At the time Microsoft were implementing CSS, it wasn't a published W3C recommendation. And Netsc
Re:Who is maintaining the "standard"? (Score:2)
I agree that Netscape should have paid more attention to CSS and other W3C standards once they actually appeared. But that's all kind of beside my point, which was that Netscape never "defined the standard".
"Nobody cared about that then"? (Score:3, Funny)
Re:Who is maintaining the "standard"? (Score:2)
Given the rest of your message, what would this achieve? The only way anything will get better is if a significant number of people push back at standards non-compliance ... and then it doesn't matter who created or maintains the standard.
The obvious place for this to happen is government bodies, and non-US ones are starting to imply they will do this. How much they push back remains to be seen.
Ugly format.. (Score:1)
Would probably be more efficient to use straight XHTML to make documents...
Oasis (Score:1)
XHTML has its place: web.
If you were looking for something witty (and Slashdot-approved) to say, you meant Oasis [oasis-open.org].
Re:Ugly format.. (Score:5, Insightful)
Where's the downside?
Re:Ugly format.. (Score:2)
Re:Ugly format.. (Score:2, Insightful)
"Verbose" perhaps... but verbosity is kind of the whole point of XML in the first place.
I hate MS as much as the next guy, but I'm thrilled with the fact that they are finally creeping towards some open document standards.
When you consider that their main profit strategy for the last 5-10 years has been "force pointless upgrade sales by screwing with the document format and breaking compatibility with everybody, including our
Re:Ugly format.. (Score:2)
This is just an additional format.
Re:Ugly format.. (Score:1)
What's more, it is a logical step to use XML, as it is the little brother of the SGML system that dominated documentation for larger companies that could afford development of an SGML system
Re:Ugly format.. (Score:2)
Re:Firefox?? (Score:1)
(followed by the source code)
Re:Firefox?? (Score:2)
Save to disk.
Open in Finder.
Open in TextEdit from Firefox? Please tell me that isn't possible.
Re:Firefox?? (Score:2)
Re:Firefox?? (Score:2)
In the case of Firefox on the Mac, that means it shouldn't trust LaunchServices, because LaunchServices includes any application that wants to handle local content. It should only trust "Library/Internet Plugins", and then when necessary (such as for itms:) add specific cases of LaunchServices e
Re:Firefox?? (Score:2)
.xml? (Score:3, Interesting)
Possibly this was a wrapper for the format to encapsulate images etc? Can anyone who has actually looked at this clarify?
Thanks,
Stuart
Re:.xml? (Score:2)
Holy Riddler, Batman! (Score:2)
Re:Holy Riddler, Batman! (Score:2)
Re:Holy Riddler, Batman! (Score:3, Informative)
<w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/2/wordml" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:SL="http://schemas.microsoft.com/schemaLibrary/2003/2/core" xmlns:am
that's gotta be the worst XML ever (Score:1)
any sign of an xmlns attribute anywhere? nope, and yet, they use the ns:tagName notation...
stupid.
has M$ at least released the XML Schemas for the formats? If not, forget it: it's just as illegible as binary...
and let's not forget it'll only display correctly inside MSWord itself...
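For what it's worth, the xmlns complaint is easy to test. Here is a small Python sketch (the `urn:example:demo` namespace and the element names are placeholders, not taken from the Word schema) showing that a parser rejects a prefixed tag with no declaration and accepts the same tag once the prefix is bound:

```python
import xml.etree.ElementTree as ET

# Prefixed elements with no xmlns:w declaration: not namespace-well-formed.
undeclared = "<w:p><w:t>hi</w:t></w:p>"

# Same elements with the prefix bound to a placeholder namespace URI.
declared = '<w:p xmlns:w="urn:example:demo"><w:t>hi</w:t></w:p>'


def parses(text: str) -> bool:
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print("undeclared prefix parses:", parses(undeclared))
print("declared prefix parses:", parses(declared))
```

So a file that really used ns:tagName with no xmlns attribute would not load in any namespace-aware parser; if it loads, the declarations are in there somewhere.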
Re:that's gotta be the worst XML ever (Score:1)
Re:that's gotta be the worst XML ever (Score:1)
Why don't you think that over for a couple minutes. It sort of invalidates your M$ bitch fest.
Re:that's gotta be the worst XML ever (Score:2)
People expect to see a Word icon, and to be able to launch the appropriate application.
Re:that's gotta be the worst XML ever (Score:3, Informative)
Re:that's gotta be the worst XML ever (Score:1)
Influenced by the Open Source Movement (Score:1)
Word XML not necessarily a voluntary move... (Score:5, Informative)
One thing to note is that the Microsoft XML formats and schemas, either those exported by TextEdit or by the .docx format, are not necessarily done by Microsoft by choice. They're not even in response to OpenOffice.org. In my opinion, they are the result of "government forced technology", similar to how the California clean air regulations back in the 70s started to force Detroit to pour more money into catalytic converters and environmentally friendly cars.
There have been numerous government proposals and mandates that require open document formats. Some of the Massachusetts proposals come to mind. I believe the EU also has proposals on the table that require the use of open document formats. The trick with the EU proposal is that it actually mentioned XML (I believe it's the ISIS proposal, but may have the wrong acronym). Governments are large Microsoft customers and Microsoft doesn't want to lose their business. Including the ability to save in publicly documented XML formats gives them a loophole to continue selling to governments, even if all of the open document format requirements are adopted.
The ability of OpenOffice.org (and NeoOffice/J) to support these formats really is dependent on two things. First, the schemas are licensed from Microsoft on non-OSS compatible terms. Each individual person or application has to enter into a licensing agreement with Microsoft individually. This is directly against the terms of either BSD style or GPL style licensing. Secondly, Microsoft may have software patents involved with their schemas according to their licensing terms. While the patentability of a schema itself is questionable, they seem to have several patents revolving around the interpretation of XML schemas that may apply to their Office schemas. This goes against the CDDL style licensing Sun is now fond of.
Because of these terms, the only ways that OOo/NeoOffice could legally support them would be if either the schemas are clean room reverse engineered from example documents or if Microsoft turns a blind eye to open source folk using their schemas. Since I wouldn't want to rely on Microsoft's generosity, the clean room solution is the only way I can see. Sun won't be the one to clean room them either; they don't have to. StarOffice (and Sun built OpenOffice.org for Linux/Solaris/Win) would be covered under Sun's cross-licensing arrangements with Microsoft as a result of their settlement. Those licenses don't extend to non-Sun OOo developers like me, however, so we're all up shit creek.
Just because you can read it and the format is "open" doesn't mean it's "free". You can be sure that Microsoft's lobbyists will make sure that all of those government directives still refer to "open" and no "free" gets snuck in there by mistake.
ed
Re:Word XML not necessarily a voluntary move... (Score:2)
However, IBM has the capability to clean-room reverse engineer a free and open spec. So long as they are pushing a J2EE-centric application strategy opposing .NET, they have every reason to make a freely open implementation available to the rest of the world.
Hope springs eternal ...
Re:Word XML not necessarily a voluntary move... (Score:2)
Interesting... (Score:2)
Re:Interesting... (Score:3, Insightful)
Ah, Pages. The program has some neat features, but has all of the hallmarks of being rushed out of the door for the 1.0 release. It's a nifty program for making flyers, and maybe short newsletters, but it's pretty much a loss to do any serious w
Re:I almost got excited... (Score:1)
Pages (Score:1, Redundant)
The fact that it knew it was a word doc is promising. Looks like Pages will support it too...
Safari does it too (Score:1)
1. This is a part of NSTextEdit class (or whatever its name is) and is not specific to TextEdit.app
2. It's been around a bit longer, at least since 10.3.8, it just wasn't exposed in TextEdit.app
The good thing is that all the Cocoa apps that use this class will also get the ability to handle Word XML docs - for free.