Archive for December, 2005

* Declarative Data Binding

Posted on December 30th, 2005 by Dave Johnson. Filed under AJAX, Declarative Programming, Web2.0, XML.


I am currently looking into various markup-based / declarative languages such as SVG, XForms, MXML/Flex, XUL, XAML and Laszlo. These are all picking up steam to various degrees. I find it hard to imagine a Microsoft future without XAML (or an Adobe future without Flex, for that matter) - though I suppose it is _possible_…

At any rate, the evolving declarative landscape is very interesting particularly since every language is similar to the next - so close in fact that they are really just a quick XSL Transformation away from each other. The question is which technology will be the trail blazer that defines the new landscape? To my way of thinking, the most important part of any of these declarative languages is data binding and that is where the war will be won.

Several of these technologies are W3C specifications (SVG, XForms, XBL) while the others are being pushed by various companies / foundations such as Mozilla (XUL), Adobe (MXML/Flex), Microsoft (XAML) and Laszlo Systems (Laszlo).

What I do like quite a bit on the W3C side of things is XBL - it seems pretty well thought out, though there are still lingering questions that I need to sort out. When looking at these different languages, the first thing I examine is how easy it is to create a simple lookup or listbox. Essentially, consider a foreign key relationship between two sets of data, such as a list of person records: in the user interface you want to display the list of names while actually binding to each person's ID. This is fairly easily achieved in XForms from what I have seen (when using select or select1); however, I find the other languages are a bit more ambiguous on these sorts of issues. Here is the snippet from XForms:

<select model="cone" ref="my:order">
  <label>Flavors</label>
  <itemset model="flavors" nodeset="/my:flavors/my:flavor">
    <label ref="my:description"/>
    <value ref="my:description"/>
  </itemset>
</select>

Essentially, the <value ref="my:description"/> tag tells the XForms processor the XPath in the data source / model that the selected value should come from - which can be different from the path used for the label.
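For contrast, here is roughly what that same label/value indirection looks like when wired up by hand in JavaScript - the data, element ID and field names are all invented, and this is just a sketch of the imperative baseline that declarative binding would replace:

function populatePersonSelect()
{
  //hypothetical data - display the name, bind to the id (the foreign key case described above)
  var people = [
    { id: 17, name: "Alice" },
    { id: 42, name: "Bob" }
  ];

  var select = document.getElementById("personSelect");
  for (var i = 0; i < people.length; i++)
  {
    //new Option(label, value) - the label is what the user sees, the value is what gets submitted
    select.options[select.options.length] = new Option(people[i].name, people[i].id);
  }
}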

If anyone can point me to some good examples in any of these languages I would love to see them!




* Social Annotation

Posted on December 28th, 2005 by Dave Johnson. Filed under Search, Semantic Web, Tagging, Web2.0.


I have just read about a company currently in private beta called Diigo, which is in the business of social annotation (SA).

Apparently SA is a superset of social bookmarking or tagging, which is of course the pièce de résistance of 'Web 2.0'. The question is: can SA be an even better route to getting acquired by MAGY? Don't quote me on 'MAGY' though, since I am not sure what order those names should go in…

I had been thinking about SA for some time but did not have the time / resources to get anything together for public showing - this might be a good reason to do so. Of course, given my record with getting code up on my blog, I won't have a sample until this time next year. Anyhow, the possibilities for SA are much more attractive than social bookmarking in my mind. With social annotation (at least what I consider it to be) I can surf to any web page and place tagged sticky notes (private or public), in a browser-agnostic fashion, that contain my comments and refer to a certain block in the web page DOM. Then I can go to some central place to view / organize my comments, and can also subscribe to RSS feeds of other people's comments on those pages or from particular people. The main problem that I have with Diigo (from the looks of their Flash demo) is that I need to install their toolbar - yuck!

The useful part of these systems for end users is that they can tag particular bits of content on a page and find exactly what they were referring to with a tag. Then if you combine this idea with microformats and the Semantic Web you might really be cooking with something combustible like methane.

This brings us to the all-important (both dreaded and revered at the same time) question of 'monetization' - I guess I have to eat somehow, but that is why I have a day job :) In a perfect world I imagine the toolbar from Diigo being essentially a web toolbar (as opposed to browser-integrated) that floats over the current page and is inserted using a bookmarklet in true AJAX fashion. The toolbar could carry relevant ads, and there could also be relevant advertising on the notes themselves. But hey, who needs money when you have a few hundred thousand users and 'social tagging/sharing/annotation' hype to help you implement your 'Web 2.0' exit strategy.
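To be concrete about what I am imagining, a bookmarklet-injected web toolbar might bootstrap itself along these lines - a rough sketch only, with invented names and URL, and certainly not Diigo's actual code:

//the bookmarklet itself would just be this function, minified into a javascript: URL
function loadAnnotationToolbar()
{
  //pull the real annotation script from a central server so the bookmarklet stays tiny
  var s = document.createElement("script");
  s.src = "http://annotate.example.com/toolbar.js"; //hypothetical central server
  document.getElementsByTagName("head")[0].appendChild(s);
}

//toolbar.js could then float a simple toolbar over the current page
function showToolbar()
{
  var bar = document.createElement("div");
  bar.style.position = "absolute"; //"fixed" is unreliable in IE 6
  bar.style.top = "0px";
  bar.style.left = "0px";
  bar.style.width = "100%";
  bar.style.background = "#ffc";
  bar.appendChild(document.createTextNode("My notes - click an element to attach a sticky note"));
  document.body.appendChild(bar);
}

The nice part is that nothing needs to be installed in the browser - the note data and its anchor into the page DOM would live on the central server.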




* Fuel for the Tag Embers

Posted on December 23rd, 2005 by Dave Johnson. Filed under Search, Semantic Web, Tagging, Web2.0.


Om Malik posted about the increasing interest in people power vs the power of Google [1]. I think that tags will lose out to automated clustering (such as Vivisimo) in the short term, but that doesn't mean we will not see more players like Wink trying to get a piece of the tagging pie. Don't get me wrong - I will give Wink a chance, and I do think that services like Wink have a place in the blogosphere today - but we already have the likes of Technorati and my new favourite, Google Blog Search.

The topic of tag utility has been covered quite a bit in the past by the likes of Tim Bray [2] and Stephen Green [3] (both good Canucks) and I am sure it will be discussed well into the future! On the whole I have to agree with Tim, and Stephen brings up some very interesting points from his research that should be considered. I will discuss those in a moment.

But first, there are a few issues that I can see with the new emphasis on the old idea of tagging …

  • People are lazy. Who wants to waste their time rating pages when Google does a _pretty good_ job on its own?
  • People who are not lazy (like geeks, maybe) cause tagged content to be heavily skewed toward their interest group, and it therefore becomes inaccessible to the majority of people.
  • There is lots of meta-data (some may even call it “tags”) available to search engines based on page content - so why do more work?
  • If I tag a page as "interesting", that is only in the context of what I am thinking at that moment in time. Tags can have temporal / geographic / personal dependence, which is not easy to manage with today's tagging systems.

For example, a current topic that I am very interested in is the science (or maybe art?) of data binding - i.e. how to create a binding language that provides rich mechanisms for indirection, and how to express it using a declarative / markup approach. This is something that is quite difficult to find information about using Google or Yahoo!. Could tagging of content help me find some obscure piece of very relevant and useful information on this topic? If someone has found it before me and tagged it with the precise tags that I would use for the topic, then maybe. However, I'm not convinced [4], and it seems that John Battelle is not either [5].

Here is the thing: people need to look beyond the tag - it is a stop-gap that has been tried many times before (web page keywords, anyone?). The places where tags have had some success, as Stephen mentions, are instances where you have defined vocabularies or taxonomies. Content is tagged by domain experts and integrated into a taxonomy at great expense but with great reward (this seems to be a recurring theme). I am not sure that people using the web want to be constrained like this - yet it is the best way to get value from tagging, so that everyone "talks the same language".

This brings me to a point that I have brought up before [4]. Forget tags. Think semantics. Think Semantic Web [6]. The discussion should not be about the value of tags but about moving towards a richer Web. More on that soon.

References
[1] People Power vs Google - Om Malik, Dec 22, 2005
[2] Do Tags Work? - Tim Bray, Mar 4, 2005
[3] Tags, keywords, and inconsistency - Stephen Green, May 13, 2005
[4] More Tags - Dave Johnson, Dec 14, 2005
[5] Will Tagging Work - John Battelle, Dec 4, 2005
[6] Tagging Tags - Dave Johnson, Dec 1, 2005




* JSON Benchmarking: Beating a Dead Horse

Posted on December 21st, 2005 by Dave Johnson. Filed under AJAX, JSON, JavaScript, Web2.0, XML, XSLT.


There has been a great discussion over at Quirksmode [1] about the best response format for your data and how to get it into your web application / page. I wish that I had the time to respond to each of the comments individually! It seems that PPK missed out on the option of using XSLT to transform your XML data into an HTML snippet. In the ensuing discussion only a few people mentioned XSLT, and many of them just to say that it is moot! I have gone over the benefits of XSLT before but I don't mind going through it once more :) Just so everyone knows, I am looking at the problem _primarily_ from the large client-side dataset perspective, but will highlight areas where JSON or HTML snippets are also useful. Furthermore, I will show recent results for JSON vs XSLT vs XML DOM in Firefox 1.5 on Windows 2000 and provide the benchmarking code so that everyone can try it themselves (this should be up shortly - just trying to make it readable).

As usual we need to take the "choose the right tool for the job" stance and try to be objective. There are many dimensions on which tools may be evaluated. To determine these dimensions, let's think about what our goals are. At the end of the day I want to see scalable, usable, reusable and high-performance applications developed in as little time and for as little money as possible.

End-User Application Performance
In creating usable and high-performance web applications (using AJAX of course), end-users will need to download a little bit of data up front to get the process going (and they will generally have to live with that) before using the application. While using the application there should be as little latency as possible when they edit or create data or interact with the page. To that end, users will likely need to sort or filter large amounts of data in table and tree formats, and they will need to create, update and delete data that gets saved to the server - and all this has to happen seamlessly. This, particularly the client-side sorting and filtering of data, necessitates fast data manipulation on the client. So the first question is: what data format provides the best client-side performance for the end-user?

HTML snippets are nice since they can be retrieved from the server and inserted into your application instantly - very fast. But you have to ask whether this still performs well when you want to sort or filter that same data. You would either have to crawl through the HTML snippet and build some data structure, or re-request the data from the server - if you have understanding users who don't mind the wait, or have the server resources and bandwidth of Google, then maybe it will work for you. Furthermore, if you need fine-grained access to various parts of the application based on the data, then HTML snippets are not so great.

JSON can also be good. But as I will show shortly, and have before, it can be slow, since the eval() function is slow and looping through your data to create bits of HTML for output is also slow. Sorting and filtering arrays of data in JavaScript can be done fairly easily and quickly (though you still have to loop through your data to create your output HTML) and I will show some benchmarks for this later too.
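To make the JSON path concrete, the work being timed is roughly the following (the data, field names and target element are invented for illustration):

//hypothetical response - in the benchmarks the string comes from XMLHttpRequest.responseText
var json = '[{"id":1,"name":"Widget","price":9.99},{"id":2,"name":"Gadget","price":4.50}]';

//step 1: eval() the string to get live JavaScript objects (the slow part for big payloads)
var rows = eval(json);

//client-side sorting / filtering is then easy enough...
rows.sort(function (a, b) { return a.price - b.price; });

//step 2: ...but you still have to loop over the data to build the output HTML yourself
var html = [];
for (var i = 0; i < rows.length; i++)
{
  html.push("<tr><td>" + rows[i].name + "</td><td>" + rows[i].price + "</td></tr>");
}
document.getElementById("grid").innerHTML = "<table>" + html.join("") + "</table>";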

XML DOM is not great. You almost might as well be using JSON if you ask me. But it can have its place, which will come up later.

XML + XSLT (XAXT), on the other hand, is really quite fast in modern browsers and is a great choice for dealing with loads of data when you need things like conditional formatting and sorting right on the client without any additional calls to the server.
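The plumbing is browser-specific, but a browser-independent transform helper (the benchmarks below use my own transformXml function; this sketch just shows the general shape, not the exact code) boils down to something like:

//rough sketch of a cross-browser "transform XML with XSLT to an HTML string" helper
function transformToHtml(xmlDoc, xslDoc)
{
  if (window.XSLTProcessor)
  {
    //Mozilla / Firefox
    var processor = new XSLTProcessor();
    processor.importStylesheet(xslDoc);
    var fragment = processor.transformToFragment(xmlDoc, document);
    var div = document.createElement("div");
    div.appendChild(fragment);
    return div.innerHTML;
  }
  else if (xmlDoc.transformNode)
  {
    //IE 6 with MSXML
    return xmlDoc.transformNode(xslDoc);
  }
  return null;
}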

System Complexity and Development
On the other hand, we also have to consider how much more difficult it is to create an application that uses the various data formats, as well as how extensible the system is for future development.

HTML snippets don't really help anyone. They cannot really be used outside of the specific application that they were made for, though when coupled with XSLT on the server they can be useful.

JSON can be used between many different programming languages (not necessarily natively) and there are plenty of serializers available. Developers can work with JSON fairly easily in this way but it cannot be used with Web Services or SOA.

XML is the preferred data format for Web Services, many programming languages, and many developers. Java and C# have native support for serializing to and de-serializing from XML. Importantly, on the server XML data can be typed, which is necessary for languages like Java and C#. Inside the enterprise, XML is the lingua franca, so interoperability and data re-use are maximized, particularly as Service Oriented Architecture begins to get more uptake. XSLT on the server is very fast and has the advantage that it can be used, like XML, in almost any programming language including JavaScript. Using XSLT with XML can be a problem in some browsers; moving the transformations to the server is one option, although this entails more work for the developer.

Other factors

  • The data format should also be considered due to bandwidth concerns that affect user-interface latency. Although many people say that XML is too bloated, it can easily be encoded in many cases and can become far more compact than JSON or HTML snippets.
  • As I mentioned XML can be typed using Schemas, which can come in handy.
  • Human readability of XML also has some advantages.
  • JSON can be accessed across domains by dynamically creating script tags - this is handy for mash-ups (see the sketch after this list).
  • Standards - XML.
  • Since XML is more widely used it is easier to find developers that know it in case you have some staff turnover.
  • Finally, ECMAScript for XML (E4X) is a _very_ good reason to use XML [2]!
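On the cross-domain point above, the dynamic script tag trick looks roughly like this (the service URL and callback name are made up):

//the service wraps its JSON in a call to the named callback, e.g. handleData({"items":[...]});
function handleData(data)
{
  alert("got " + data.items.length + " items from another domain");
}

var script = document.createElement("script");
script.src = "http://someservice.example.com/items?callback=handleData"; //hypothetical service
document.getElementsByTagName("head")[0].appendChild(script);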

Business Cases
There are various business cases for AJAX, and I see three areas that differentiate where one data format should be chosen over another: mash-ups / the public web (JSON can be good), B2B (XML), and internal corporate (XML or JSON). Let's look at some of the particular cases:

  • if you are building a service only to be consumed by your application in one instance then go ahead and use JSON or HTML (public web)
  • if you need to update various different parts of an application / page based on the data then JSON is good or use XML DOM
  • if you are building a service only to be consumed by JavaScript / AJAX applications then go ahead and use JSON or a server-side XML proxy (mash-up)
  • if you are building a service to be consumed by various clients then you might want to use something that is standard like XML
  • if you are building high performance AJAX applications then use XML and XSLT on the client to reduce server load and latency
  • if your servers can handle it and you don't need interaction with the data on the client (like sorting, filtering, etc.) then use XSLT on the server and send HTML snippets to the browser
  • if you are re-purposing your corporate data for a highly interactive and low-latency web-based application then you had better use XML as your data message format and XSLT to process the data on the client without having to make calls back to the server - that way, if the client does not support XSLT (and you don't want to use the _very slow_ [3] Google XSLT engine), you can always fall back to making requests to the server to transform your data.
  • if you want to have an easy time finding developers for your team then use XML
  • if you want to be able to easily serialize and deserialize typed data on the server then use XML
  • if you are developing a product to be used by “regular joe” type developers then XML can even be a stretch

I could go on and on …

Performance Benchmarks
For me, client-side performance is one of the biggest reasons to stick with XML + XSLT (XAXT) rather than JSON. I have done some more benchmarking on the recent release of Firefox 1.5 and it looks like the XSLT engine in FF has improved a bit (or JavaScript has become worse).

The tests assume that I am retrieving some data from the server which is returned either as JSON or XML. For XML I can use the responseXML property of the XMLHttpRequest object to get an XML document, which can subsequently be transformed using a cached stylesheet to generate some HTML - I only time the actual transformation, since the XSLT object is a singleton (i.e. loaded once globally at application start) and accessing responseXML should cost little more than accessing responseText. Alternatively, the JSON string can be accessed using the responseText property of the XMLHttpRequest object. For JSON I measure the amount of time it takes to call the eval() function on the JSON string, as well as the time it takes to build the output HTML snippet. So in both cases we start with the raw output (either text or an XML DOM) from the XMLHttpRequest and I measure the parts needed to get from there to a formatted HTML snippet.

Here is the code for the testJson function:
function testJson(records)
{
  //build a test string of JSON text with the given number of records
  var json = buildJson(records);
  var t = [];
  for (var i = 0; i < tests; i++)
  {
    var startTime = new Date().getTime();
    //eval the JSON string to instantiate it
    var obj = eval(json);
    //build the output HTML based on the JSON object
    buildJsonHtml(obj);
    t.push(new Date().getTime() - startTime);
  }
  done('JSON EVAL', records, t);
}

As for the XSLT test, here it is:
function testXml(records)
{
  //build a test string of xml with the given number of records
  var sxml = buildXml(records);
  //load the xml into an XML DOM object as we would get from XMLHTTPObj.responseXML
  var xdoc = loadLocalXml(sxml, "4.0");
  //load the global XSLT
  var xslt = loadXsl(sxsl, "4.0", 0);

  var t = [];
  for (var i = 0; i < tests; i++)
  {
    var startTime = new Date().getTime();
    //browser independent transformXml function
    transformXml(xdoc, xslt, 0);
    t.push(new Date().getTime() - startTime);
  }
  done('XSLT', records, t);
}

Now on to the results … the one difference from my previous tests is that I have also tried the XML DOM method, as PPK suggested - the results were not that great.

For IE 6 nothing has changed, of course, except that we can now see that the XML DOM approach is not that quick; however, I have not tried to optimise this code yet.

Figure 1. IE 6 results for JSON, XML DOM and XML + XSLT.

On the other hand, there are some changes for FF 1.5 in that the XSLT method is now only almost as fast as the JSON method. In previous versions of FF, XSLT was considerably faster [4].

Figure 2. FF 1.5 results for JSON, XML DOM and XML + XSLT.

What does all this mean, you ask? Well, as before, I am assuming that the end-users of my application are going to use FF 1.5 and IE 6 in about equal numbers, 50-50. This might be the case for a public AJAX application on the web, say, whereas the split could be very different in a corporate setting. The figure below shows the results under this assumption, and it shows that almost no matter how many data records you are rendering, XSLT is going to be faster given 50-50 usage of each browser.

Figure 3. Total processing time in FF and IE given a 50-50 usage split.
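For clarity, the numbers behind Figure 3 are just the two browser results combined under that assumption - something like this (my notation, not part of the benchmark code):

function combinedTime(ie6Time, ff15Time)
{
  //expected processing time per render assuming a 50-50 browser split
  return 0.5 * ie6Time + 0.5 * ff15Time;
}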

I will have the page up tomorrow I hope so that everyone can try it themselves - just need to get it all prettied up a bit so that it is understandable ;)

References
[1] The AJAX Response: XML, HTML or JSON - Peter-Paul Koch, Dec 17, 2005
[2] Objectifying XML - E4X for Mozilla 1.1 - Kurt Cagle, June 13, 2005
[3] JavaScript Benchmarking - Part 3.1 - Dave Johnson, Sept 15, 2005
[4] JavaScript Benchmarking IV: JSON Revisited - Dave Johnson, Sept 29, 2005




* Structured Blogging

Posted on December 16th, 2005 by Dave Johnson. Filed under Microformat, Semantic Web, Web2.0, XML.


Paul Kedrosky chimed in on the recent introduction of Structured Blogging (SB). Paul suggests that laziness is going to prevent SB from taking off, and I would have to agree. Like many Web 2.0 concepts, it places too much faith in the user - and aside from overzealous alpha-geeks, it will likely be too much work for people to actually use.

As time goes on I am certainly finding that just using a search engine is actually faster than using del.icio.us, and less work to boot! Flickr is the one exception, where tagging is actually slightly more useful [1,2] - seeing as how search engines have a hard time indexing image content ;) . This is my common conclusion from using many different online services. Sure, I sign up for all the great new Web 2.0 / AJAX services … I signed up for Writely, and they can use me in their stats of doubling their user base every X weeks, but I am never going to use it again; not because it is not cool and slightly useful, but because I am simply too lazy.

This subject also came up yesterday as I was reading one of the latest fire-stoking "Five somethings about Web 2.0 / AJAX" posts [3] by Dion Hinchcliffe over on the Web 2.0 blog. Dion's number one reason that Web 2.0 matters is that it "seeks to ensure that we engage ourselves, participate and collaborate together". Again, I can't help but think about how lazy most people are. Sure, the people who are actually interested in Web 2.0, tagging and the like make it seem really great, but for the most part people cannot be bothered.

For Web 2.0 to get traction beyond the alpha-geeks I think it needs to empower developers and ask less of end-users.

References

[1] More Tags - Dave Johnson, Dec 14, 2005
[2] Tagging Tags - Dave Johnson, Dec 1, 2005
[3] Five Reasons Why Web 2.0 Matters - Dion Hinchcliffe, Dec 7, 2005




* More Tags

Posted on December 14th, 2005 by Dave Johnson. Filed under AJAX, Business, Semantic Web, Tagging, Web2.0.


I just stumbled upon a short post by John Battelle where he asks whether tags are going to work in the long run [1].

From my point of view the only good application of tags is for data that has no computer-readable meta-data - i.e. they are a stop-gap. Photos, movies, songs, even smells (one day) are the types of information that are hard to find using a search engine - though sooner rather than later we should be able to search for "sunset" and Flickr will return a picture like this. When it comes to web pages, however, there is plenty of information for search engines to work with. Why use a limited set of usually homogeneous tags to define a web page on del.icio.us when you can likely find it just as fast, or faster, using a search engine instead?

Furthermore, I'm lazy; I don't like to think up new tags for the resources that I find, and for the most part I end up tagging almost everything with my homogeneous tag set of XML, JavaScript, blog and AJAX … go figure. So in the end tagging is only slightly better than using the favourites in my web browser.

Having said that, there is one place that tagging might actually be useful, but only to a slightly larger degree, and that is with news. Having the del.icio.us RSS feed for AJAX is great since it is essentially a human aggregated feed for AJAX news. Still, in the future I anticipate that I will likely just ask Technorati or equivalent instead.

All in all, I have quickly fallen out of love with tags and the limited use they have [2].

As for the companies that are building businesses based on tagging - it seems to be a pretty good idea.

Update: found a great post about tags here.

[1] Will Tagging Work - John Battelle, Dec 4, 2005
[2] Tagging Tags - Dave Johnson, Dec 1, 2005




* Enterprise AJAX

Posted on December 10th, 2005 by Dave Johnson. Filed under AJAX, Business, Web2.0.


There was a recent Forbes article about AJAX [1] and what to use it for, and not use it for. The conclusion, entirely valid of course, was that AJAX is good for applications, but not if it puts the monetization (essentially advertising and search engine rankings) of your application in jeopardy.

I find it difficult to understand what the products from Writely, Zimbra and gOffice (all of which are very nice - though gOffice does need some polish to reach the MyWebOS [2] level) have to offer the "Business Leaders" who read Forbes. Is some management team going to go ahead and use Writely to collaborate on their business strategy? Or uninstall MS Office / OpenOffice and take up gOffice?

So if the enterprise is not going to use hyped-up, Web 2.0'd-up, mash'd-up applications, then why should they care about AJAX? In my mind there is one great place for AJAX in the enterprise, and that is Intranet applications. If some accounting person is entering / editing hundreds or thousands of transactions, then using regular HTTP POSTs to update the data is going to take an order of magnitude longer than an AJAX solution. If you have someone managing hierarchical or master-detail type data, then AJAX can provide even more time savings. So not only are we talking real savings in dollars and cents [3], but it also results in happier, more productive employees on the whole. The AJAX solutions that enable those types of savings are where the business value lies.

Although currently the hype is focused on the free Web 2.0 type AJAX applications, we will soon see it shift to inside the firewall where true efficiency gains can be made.

References
[1] Cleaning Up On the Web with AJAX - Tom Taulli, Nov 23, 2005
[2] MyWebOS - Dave Johnson, July 14, 2005
[3] Measuring the Benefits of AJAX - Alexei White




* E4X and JSON

Posted on December 2nd, 2005 by Dave Johnson. Filed under AJAX, JSON, JavaScript, Web2.0, XML.


Given that many people prefer to use JSON over XML (likely because they have not realized the beautiful simplicity of XSLT), it is interesting that ECMA is actually taking a step toward XML by introducing ECMAScript for XML (E4X). There is lots of interest in it so far [1-4] and it should prove to be quite useful in building AJAX components. For a good overview check out Brendan Eich's presentation.
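For anyone who has not played with it yet, here is a tiny sketch of the kind of thing E4X allows (the data is made up; Firefox 1.5 ships with E4X support):

//XML as a native JavaScript type - no DOM boilerplate
var order = <order id="123">
              <customer>Jane Doe</customer>
              <item qty="2">Vanilla</item>
            </order>;

alert(order.customer);                              //"Jane Doe" - simple dot notation
alert(order.item.@qty);                             //attribute access with @
order.appendChild(<item qty="1">Chocolate</item>);  //append another node using an XML literal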

Will E4X make JSON go the way of the Dodo?

[1] Introduction to E4X - Jon Udell, Sept 29, 2004
[2] Objectifying XML - E4X for Firefox 1.1 - Kurt Cagle, June 13, 2005
[3] AJAX and E4X for Fun and Profit - Kurt Cagle, Oct 25, 2005
[4] AJAX and Scripting Web Services with E4X, Part 1 - Paul Fremantle, Anthony Elder, Apr 8, 2005




* Tagging Tags

Posted on December 1st, 2005 by Dave Johnson. Filed under Semantic Web, Tagging, Web2.0.


I found it quite interesting some months ago when somebody posted a comment on one of my photos in Flickr asking why I had tagged it with the word “photovoltaic”. It appears that I have since taken down that photo but just take a look at this one and most people can likely see the confusion :)

I am sorry, but how can we expect a couple of words to describe everything about some picture to someone who doesn't know me or anything about the photo? At best they could say something like

“this photo is tagged with barcelona, 2005 and photovoltaic. if I Google those I find the first result is a photovoltaic conference in Barcelona in 2005 so he was probably there. but what the hell do cargo containers have to do with anything”

But when I look at that photo I think

“oh yeah that was at the photovoltaic conference in Barcelona in 2005 where I gave my talk on photon recycling and we were living in London and Ian and Annabelle came from Vancouver to visit and I felt really horrible about all those CO2 emissions from their airplane and we went to that castle in Barcelona where there was a good view of the harbour and I thought that those shipping crates looked kind of cool so I snapped this photo - I wonder what relationship this photo has with the next and previous one other than time and group? oh shit did I leave the stove on? what are the implications of cargo containers on AJAX in Spain? “

Of course this sort of thing even happens when we are talking, or reading other people's writing. Just the other day - and this is what actually spurred me to write this post - I posted a response to an AJAX question in a group on Google, and for some reason a really picky guy replied to my answer complaining about my saying "data transport encoding". He suggested that I meant to say "data transport formatting", because encoding _really_ means ASCII, UTF-16, etc. Meanwhile, dictionary.com says that encoding is "To format (electronic data) according to a standard format" - ok, so it's just data formatting, like he said. My point here is that if I had said only "data encoding", you could think that I meant DVD encoding or Huffman coding or ASCII encoding or XML encoding. Only when you take into account the _entire_ context of a statement can you ascertain the _real_ meaning. You have to take into account that I had just been reading about phase modulation for wireless communication and so used the word encoding, or maybe I had a bad lunch, or maybe I was actually thinking of a completely different word but wrote that one instead. Applying this problem to writing is obviously a bit far-fetched, but no less interesting to think about. Incidentally, the poster also objected to my use of the term "array" to refer to a group of objects - he insisted it was a data structure; I can certainly see his concern if he has just had his head in some code for a day.

And my point is what? My point is that tags, and even writing, are just not good enough. There is too much context to provide to give the tags their proper meaning. I may use the word "photovoltaic" to refer to the fact that a picture was taken while I was in a city attending a photovoltaic conference, but I may also use it to describe an actual picture of a PV panel, all at the same time.

Tags need tags.

What do other people think?
