data visualization


I just stumbled upon a really pretty music visualization project. Flokoon uses Last.FM data to make sleek network displays of artists.

flokoon

When one clicks the ‘i’ icon on a particular artist, an information pane pops up with artist bio and discography information.

Flokoon2

One can also browse the tagspace for a particular artist, and then browse the network via a tag.

It’s very well done. There are some items I’d like to see. For example, I’d like to have a visual representation of where I’ve traveled in the graph. If I go from Wolf Parade to Handsome Furs, I’d like that edge colored differently or something. Also, tapping into audio for each artist would be nice.

Flokoon isn’t just about music. They use the same approach to visualize YouTube videos and images from Fotolia.

Happy Flu is an interesting experiment in information diffusion on the interwebs. Jeannie, as my sole reader I’m relying on you to make sure my blog is not where the interwebs ends. Please. Via Ryan


Happy Tax Day kids! Who feels like this?

Death and Taxes

Music Mesh

Via Paul Lamere’s blog, a new music exploration tool called MusicMesh recently came online. This application immediately caught my attention because it so closely resembles some of the novel features of Orpheus, and some of the ‘future steps’ we wanted to incorporate. The graph-based browsing, where nodes are artists or albums and the edges represent finely grained measures of similarity; the rich information panes populated with track listings, discographies, interviews and album reviews; and the use of disparate web-based data are all components of the original Orpheus application written in 2005. MusicMesh does this style of music exploration well.

MusicMesh also tackles some of the ‘future steps’ we imagined but never achieved in Orpheus. It is web-based, forgoing a fat client altogether; the graph is more visually appealing as it incorporates album art and artist photos directly into the graph; and it allows users to listen to full songs as they browse. In a cool twist, it actually uses YouTube videos to play the music and show some video footage while users listen. This is awesome. In general, the application makes good use of sources that were not yet available when Orpheus was developed, and makes better use of some of the sources that were available in 2004/2005.

What seems to be lacking is any personalized understanding of a user’s tastes. In Orpheus, we knew what music users had in their libraries, and we captured user rating information for that music as well. This data is potentially powerful as one begins to serve up implicit or explicit recommendations in the graph or elsewhere. That doesn’t seem to exist (yet) in MusicMesh.

The meaning of the edges in the MusicMesh graph is also curious. The majority of first-degree neighbors are other albums in an artist’s discography, and then there are a few random artists thrown in. It’s not immediately clear why these additional artists are presented. What is the connection supposed to represent? It is implied that the data is derived from Last.FM, but the semantic value of the connection remains unclear. Also, the value of displaying the artist’s discography in the graph is questionable. Why not show that in one of the information panes, freeing up graph space for other similar artists or recommendations?

We discovered in our Orpheus user studies that users liked being able to control what measures of similarity were used to construct connections between artists. Some users cared about collaboration, for example, while others were more concerned with label affiliation or sound similarity. Giving users the power to choose what constitutes the edges may improve the relevance of these connections, and by extension the browsing experience.
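To make the idea of user-controlled similarity concrete, here is a minimal Python sketch. It is not how Orpheus or MusicMesh actually computes connections; the artist names, the scores, the weighted-average formula and the 0.5 threshold are all made up for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical similarity scores in [0, 1] per artist pair and per measure.
# The measure names echo the ones users asked about; the numbers are invented.
SIMILARITY: Dict[Tuple[str, str], Dict[str, float]] = {
    ("Artist A", "Artist B"): {"collaboration": 0.9, "label": 0.8, "sound": 0.4},
    ("Artist A", "Artist C"): {"collaboration": 0.1, "label": 0.0, "sound": 0.8},
    ("Artist A", "Artist D"): {"collaboration": 0.6, "label": 0.2, "sound": 0.3},
}


def build_edges(
    similarity: Dict[Tuple[str, str], Dict[str, float]],
    weights: Dict[str, float],
    threshold: float = 0.5,
) -> List[Tuple[str, str, float]]:
    """Return (artist, artist, score) edges whose weighted score meets the threshold."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one measure needs a positive weight")
    edges = []
    for (a, b), scores in similarity.items():
        # Weighted average over only the measures the user switched on.
        score = sum(w * scores.get(m, 0.0) for m, w in weights.items()) / total
        if score >= threshold:
            edges.append((a, b, round(score, 3)))
    # Strongest connections first.
    return sorted(edges, key=lambda edge: edge[2], reverse=True)


# A user who only cares about collaboration sees different neighbors
# than a user who only cares about how the music sounds.
print(build_edges(SIMILARITY, {"collaboration": 1.0}))
print(build_edges(SIMILARITY, {"sound": 1.0}))
```

The point of the sketch is only that the same underlying data can yield different graphs depending on which measures the user weights.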

It’s interesting to see how they are dealing with the intellectual property issues that come with using potentially copyrighted data, including music, album art, 3rd party text, and video. Their ‘About’ section is mostly concerned with disclaimers and legalese that suggests you’ll be tarred and feathered if you do what they do: namely use someone else’s content on your site. They claim to have licensed their data from Amazon and YouTube, but I kind of doubt they actually license the music and video content. If they have, I’m sure there are a lot of companies that would love to learn how to convince big media companies to bend over and give it up so easily.

At any rate, it’s pretty cool to see people developing applications like this, and it’s fun to be able to browse around the graph and listen to the music at the same time, and see some video footage of the artist playing. Nice work!

Orpheus Logo
Explore the Musical Universe

Orpheus Visualizer
A Dynamic Graph of Artists

Despite my Waldorfian upbringing, I’m not schooled on the details of Greek mythology – otherwise I’d attempt a wise crack along those lines.

But rather than embarrass myself, I’ll just put it like this: My Master’s thesis from UC Berkeley’s iSchool (then S.I.M.S.) was a music exploration and discovery tool called Orpheus. It was well received in 2005, and ever since I’ve been promising/threatening to resurrect it and try to improve some of the more compelling features and turn them into a web-based application. This latter part is in the works, but the first two steps were to get the Orpheus server installed and running, and make the Exploration Tool/MP3 Player client available for download. These two steps, I’m thrilled to announce, are completed!

Orpheus is all about mining a wide variety of information about music from Internet sites like Last.FM, Pitchfork Music, and Wikipedia; from structured sources like MusicBrainz and Freebase; from users’ music libraries, and from user contributed metadata, and turning this information into knowledge that can be used to discover new artists and explore the musical universe in novel ways. It’s about finding compelling new means of exploration and discovery.

As several of our professors pointed out at the time, our approach could prove useful for exploring the corpora around many other types of media – books, academic articles, movies, etc. But my primary joy and passion is music, so music is the first and foremost focus of this project.

Without further delay, you can download the Orpheus client and start using it right away. If you are new to Orpheus, you might benefit from watching this slideshow first. Before you get too excited, the rich information that Orpheus collects hasn’t been updated in a while (I’m working on adding new feeds and cleaning up the feeds we currently have). So you might not find too much about new artists in Orpheus at the moment. But don’t let that stop you! Please be sure to contact me with any problems or bugs you encounter. And stay tuned for updates and new features.

Enjoy!

I’ve been moving around quite a bit this summer. Almost every weekend I’ve been in a different location: Calaveras Big Trees State Park, Philadelphia, Montréal, New York, Washington DC, Delaware shore, Guerneville, San Diego, LA, Hawaii. Usually, when I write ‘travel’ entries, it’s because I’ve traveled to some far-off location (Vietnam, Central America, Bahamas, etc.). But this summer, I will only leave the country once while managing to rack up considerable miles. The map below charts my east coast travels in June and July. I might post another map of all my travels once the summer is in the books.


East Coast June/July

Robbins and Brian Gottlock
The Newlyweds
(courtesy nytimes.com)

Old Montréal
(courtesy gupshup.org)

Amtrak Adirondack
Adirondacks Amtrak Route

While on the east coast, I spent some quality time with the family, including my adorable nephew, and made the regular visits to various east coast destinations. One of the main reasons I stayed on the east coast for so long was a string of family-related events that were close enough together to make it hard and expensive to fly back and forth from San Francisco.

The “big” event on the east coast was the wedding of my brother-in-law, Robbins (on the left in the photo above), to his longtime partner, Brian. They married in Montréal: a destination wedding, part out of necessity and part out of fun. More on that in a minute. First I had to get to their wedding.

Montréal By Train

Based on my sister’s recommendations, I elected to take the Amtrak train from Philadelphia to Montréal. The trip was a whopping 14 hours, but was a completely gorgeous and relaxing time, despite the fact that I had to rise at 4:30 am. After a brief ride up to Penn Station, the Adirondack route follows the full length of the Hudson, winding along cliffs, through pine forests and misty hills, past West Point (an institution very familiar to generations of Maurys and Bunkers, but thankfully not me). There are several stops along the Hudson, but once north of Albany, it’s nearly a non-stop trip the rest of the way. I secured two seats to myself and was able to stretch out, nap, relax, read and watch the scenery roll by. Intermittently on the northern route, I got some work done. This train ride is highly recommended if you have the time and disposition.

Once in Montréal, I met up with a bunch of the wedding goers who turned out to be quite fun and entertaining. I wasn’t able to really take in the city, but what I saw I liked a lot. The old French influence and the European architecture gave me the sensation of really being in a foreign country, even though I was only an hour from the border.

My favorite area of the city was probably the Latin Quarter, where we went the first night for Robbins’ bachelor party. One thing I found remarkably backwards about Canada (or Quebec/Montréal more specifically) is that many of the gay male clubs can and do prohibit women from entering. How can a society so ahead of the US in so many ways still have rules like that? Mind you, it didn’t impede our determined group, as about 25 men and 10 women stormed one of the gay strip clubs and proceeded to watch really buff, hairless men stroke their johnsons on stage. After the strip club, we went upstairs to a club, called Unity, and proceeded to get down on the dance floor. Our party knows how to party – we promptly had about 5 guys dancing on stage, and kept going until about 3:00AM. This was the first dance episode of several that would mark the high points of the weekend, and my trip back east.

The Wedding

The following day, at a civilized 4:00PM, we all met at the St. James United Church, crowded in, took our seats, and the ceremony commenced. Despite the whole “gay” thing, the wedding was the most traditional I’ve attended in years. The ceremony was in a church, ‘traditional’ vows were exchanged, one husband is taking the other’s surname, all the guests showered the newlyweds with bubbles as they left the church, and the couple sped off in an awesome chauffeured car.

The reception was held at Le Centre des Sciences De Montréal, overlooking the fleuve Saint-Laurent (or Saint Lawrence River in English). The food was delicious and the wedding band was off the hook. They played all the predictable wedding songs, but it almost didn’t matter what they were playing, as the wedding guests, myself included, were ready to dance the night away.

By about 1:00AM, they kicked us out of the reception, and we found another club that was willing to let us dance until about 3:00AM again. I’m sure we made quite the sight – about 15 of us, dressed to the nines, wearing sunglasses (for some reason), came in, found a place in the back of the bar, and proceeded to go crazy. In no time, there were people dancing on tables, shirts were gone, the dancing got dirty. Three bouncers monitored our group closely, but we committed no offense so heinous as to be removed from the club. All around, one of the funnest weddings I’ve attended since my cousin Brooke’s in Texas in 2005.

Reflections on Same Sex Weddings

I can’t write up this travel log without articulating my frustration with why we had to travel to a foreign country to witness the union of these two wonderful and loving people. This is the first of four wedding-related events I will attend this summer, and it was the second same sex wedding that I’ve attended. As I watched these two people marry, and as I’ve reflected on this topic over the last couple of weeks, I’ve grown increasingly frustrated with the U.S. legal position on gay marriage, the general population’s aversion to it, and the leading presidential candidates’ failure to take a firm position one way or the other on this issue.

One evening in Montréal, a few of us were discussing the presidential race, and one person opined that her primary issue, the one that will determine who she votes for, is the candidate’s position on gay marriage. I thought at the time this was an extreme position to take — after all, the candidate has to win a general election, and gay marriage is not supported by a vast majority of Americans. But the more I think about it, the more I agree with her. Fundamentally, gay marriage is a civil rights issue with far reaching consequences. When will our national politicians stand up and fight for equal rights? When will they decide to lead, rather than pander to the polls on this issue?

The first gay wedding I attended was in San Francisco, on the first weekend that Mayor Newsom legalized same sex marriages in that city. People came from all over the world to marry in San Francisco. The feel in the city that week was unforgettable. It was hopeful, exciting and celebratory.

We all had the sense that we were taking part in something revolutionary and historic. As most readers will remember, the state Supreme Court ordered a stop to the unions, effectively annulling the thousands of weddings that took place (although the vows and the promises made remain the same). But the political and moral point had been made. I believed then and I still believe today that this issue is the civil rights battle of our era. It’s a long struggle and won’t be won overnight. States can annul all the marriages they want, Congress can pass all of the amendments and resolutions they want, but they can’t stop the steady progression of thought. I’m confident that my children will look back on our society’s perspective on gay marriage in much the same way that we regard the prohibition on interracial marriage – as backward and plain ignorant.

But until that time, gay couples remain relegated to travel – across the country, across state lines or to another country – if they wish to trade vows. Our enlightened neighbors to the North have not only legalized gay marriage, they have embraced it socially. The marriage I attended in Montréal felt like any other marriage: it was a union of two people who love each other very much, and a union of their friends and communities. Here’s to the day when we attend destination weddings solely for the fun of it, not out of legal obligation.

Explore the Musical Universe

Okay, so maybe you can’t actually use Orpheus right now, but you can finally take a look at the interface and get a sense for the application. I spent a few hours this weekend taking screen shots of my beloved Master’s thesis (created with Vijay Viswanathan and Jeannie Yang) and putting them online with some explanatory text. You can check out a slide show of Orpheus’ main functionality right here. If you’re looking for the database documentation or older project documents, you’ll still need to visit the thesis site. I’d like to say in the coming weeks you’ll see more content online about Orpheus, but that’s probably not going to happen. The main motivation to post this stuff online is that as a professional I have something to point people to when I talk about what Orpheus did and what it is capable of.

As part of this process, I also compiled and installed both the client and server on my local media machine. Orpheus still looks cool, and is still enormously fun and satisfying to play with. The music landscape has changed a lot in the last two years, but I still love the idea of Orpheus.

Recent Listening Screen Shot one
Recent Listening Screen Shot

Last Sunday I wrote a PHP tool that displays my recently played tracks in the side bar of my blog. I used the web services APIs from three sources to do this: Last.FM (recent tracks), MusicBrainz (album name), and Amazon (album art, label, etc.). My main motivation for writing this application was to replace the “Now Playing” application that provided similar functionality for my blog. I lost that plug-in, along with the license key, when my PC crashed a few weeks ago. I could have re-installed the Now Playing plugin, or used one of several other plug-ins for WordPress out there, but I wanted to see how easy or hard it would be to do this myself. I considered this exercise a baby-step along the way towards migrating the browsing and discovery capabilities of Orpheus from a fat client application to a web-based tool. There are miles to go before I get there, but this is a start.

I called this tool a ‘mash-up’ in the title, and to the extent that it fits Wikipedia’s definition of a “web application that seamlessly combines content from more than one source into an integrated experience,” it may loosely be considered one, provided we remove the adverb “seamlessly” from that description. Hitting up 3 data sources iteratively produces some unseemly latency. I could have removed MusicBrainz from the equation if Last.FM published the album name in their XML feed of recent tracks, but they don’t. So 3 web services it is. At any rate, this is my first mash-up, so yay for me.

And yay for you, because I’m posting the code here for others to use. It’s been tested in WordPress and Mozilla Firefox. It is a tad slow, but easy to configure and use. Be aware, there is little in the way of error handling, so if any of the 3 web services has problems, all goes to hell in a handbasket. I’ve seen this happen on the MB query when I have crazy long track names, usually on Classical music. This code is licensed under Creative Commons‘ “Attribution-NonCommercial-ShareAlike 2.5” license. For those interested in how I built this, and what I learned in the process, read on!

Before I started hitting up various web services, my first brilliant idea was to hack the iTunes API, take all of the relevant track metadata, query Amazon for album art and all kinds of other good stuff, and post it to my blog as XML for parsing. This is exactly what Brandon’s tool does, so I would essentially be rebuilding his system with fewer and different features to suit my needs. Of course, this approach required that I know C/Objective-C, which I don’t. After nodding off reading some code examples, I decided to defer my mastery of C for a later date. Ultimately, if I am going to migrate Orpheus to the web, I’ll need some simple little iTunes plugin, but that can wait. I discovered during my research that it is possible to query the iTunes XML database directly without working through the iTunes API, providing a real-time “snapshot” of the library. But there are challenges with doing this as well, and most of the data I could get from the iTunes DB I could get elsewhere. For now, I would avoid working with any local data at all, and rely exclusively on existing web service data and only one local plug-in, AudioScrobbler.

I was already using the AudioScrobbler plug-in for iTunes to post my listening behavior to Last.FM. And, bless their hearts, Last.FM is kind enough to offer up a WS API for accessing said data (as well as much more!). So I could get a live, on-demand XML representation of my recently listened-to tracks via the Last.FM Web Service. As I mentioned earlier, Last.FM’s web service for recent tracks doesn’t return all of the metadata about a track. Most notably missing is the name of the album. Without the name of the album, an artist name and a track name only provide a partial picture of the song in question. Most ID3 tags include the album name, so why isn’t it available in my recent tracks XML feed?

I don’t know if this ‘bug’ is related to the data the AudioScrobbler plug-in sends to Last.FM, or to Last.FM just not publishing the track data in its entirety. Whatever the reason, I needed the album name in order to build a useful query for Amazon. I decided to use MusicBrainz to attempt to determine the album name. MB’s Web Service is cool, but somewhat ill-suited for my very specific and unusual request. I needed to know, given an artist name and a track name, what the most likely album was that the track appeared on. This is admittedly an ass-backwards way of going about things, but I needed that question answered. Tracks, naturally, can show up on a variety of albums: the Single, the EP, the LP, the bootleg, the remix, etc. My queries returned some peculiar results in a few circumstances, so I decided to employ some additional logic to decide if the album name returned from the MB query was reliable enough to use. This approach means I don’t get the name of the album in a lot of circumstances, which sucks. You can see how several of the albums have no cover art. If the code can find the (correct) album on MB, it will query the Amazon web service for album art and all the other goodies they have.
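For a flavor of what such “additional logic” might look like, here is a small Python sketch (the tool itself is PHP). The rules below are my assumptions for illustration, not the ones in the downloadable code: match the artist name exactly after normalizing case and whitespace, prefer full albums over EPs and singles, and give up when more than one title is equally plausible. The candidate list stands in for whatever a MusicBrainz lookup returns; its field names are hypothetical.

```python
from typing import Dict, List, Optional

# Assumed preference order: lower rank wins. Other release types are ignored.
TYPE_RANK = {"Album": 0, "EP": 1, "Single": 2}


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't block a match."""
    return " ".join(name.lower().split())


def pick_album(artist: str, candidates: List[Dict[str, str]]) -> Optional[str]:
    """Return the most plausible album title, or None if the answer is unreliable.

    Each candidate is a dict with hypothetical keys "title", "artist" and "type".
    """
    matches = [
        release for release in candidates
        if normalize(release["artist"]) == normalize(artist)
        and release["type"] in TYPE_RANK
    ]
    if not matches:
        return None
    best_rank = min(TYPE_RANK[release["type"]] for release in matches)
    titles = {
        normalize(release["title"]): release["title"]
        for release in matches
        if TYPE_RANK[release["type"]] == best_rank
    }
    # Several different titles at the best rank: too ambiguous to trust.
    if len(titles) != 1:
        return None
    return next(iter(titles.values()))


candidates = [
    {"title": "First Record", "artist": "Example Band", "type": "Album"},
    {"title": "First Single", "artist": "Example Band", "type": "Single"},
    {"title": "Hits of the Year", "artist": "Various Artists", "type": "Compilation"},
]
print(pick_album("example band", candidates))
```

Returning None when the answer is ambiguous is the trade-off described above: fewer wrong covers, at the price of more tracks with no cover at all.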

Once all of the data is collected, it gets parsed and posted as an unordered HTML list. Links to Last.FM and Amazon pages are included, and mousing over the image or track listing will show what time the track was played (in London time, unfortunately…). Pretty spiffy.
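The rendering step is simple enough to sketch in a few lines of Python (again, not the PHP from the download). The field names and the fixed UTC offset are assumptions for illustration; shifting the timestamp is one way around the London-time annoyance, since the feed’s times can be treated as UTC and moved to whatever offset you like.

```python
from datetime import datetime, timedelta, timezone
from html import escape
from typing import Dict, List


def render_tracks(tracks: List[Dict], utc_offset_hours: int = -8) -> str:
    """Render tracks as an unordered HTML list with the play time as a tooltip.

    Each track is a dict with hypothetical keys "artist", "track", "url"
    and "played_utc" (seconds since the Unix epoch, in UTC).
    """
    local = timezone(timedelta(hours=utc_offset_hours))
    items = []
    for track in tracks:
        played = datetime.fromtimestamp(track["played_utc"], tz=timezone.utc)
        tooltip = played.astimezone(local).strftime("%Y-%m-%d %H:%M")
        # Escape everything that came from a web service before it hits the page.
        label = escape(f'{track["artist"]} - {track["track"]}')
        url = escape(track["url"], quote=True)
        items.append(f'  <li title="{tooltip}"><a href="{url}">{label}</a></li>')
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


sample = [{
    "artist": "Example & Band",
    "track": "Example Track",
    "url": "http://example.com/track",
    "played_utc": 1136073600,  # 2006-01-01 00:00 UTC
}]
print(render_tracks(sample))
```

A fixed offset ignores daylight saving time, so a real version would want a proper time zone database instead.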

All of this was done using REST requests (no SOAP here) and PHP to parse the resulting XML files. I avoided using XSLT for processing the XML because my web server doesn’t have XSLT enabled in PHP. Plus, the data needed to get into PHP at some point, so I decided to just do the parsing in PHP using xml_parse_into_struct. I relied on several great resources to build this. These two were the most useful. Visit my del.icio.us site for other useful sites.
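For readers who don’t speak PHP, the parsing step looks roughly like this in Python, using the standard library’s ElementTree in place of xml_parse_into_struct. The sample XML is a made-up approximation of a recent-tracks feed, not a copy of Last.FM’s actual schema; note that it carries an artist and a track name but no album, which is the gap described above.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical feed shape, for illustration only.
SAMPLE = """<recenttracks user="example">
  <track>
    <artist>Example Artist</artist>
    <name>Example Track</name>
    <date uts="1136073600">1 Jan 2006, 00:00</date>
  </track>
  <track>
    <artist>Another Artist</artist>
    <name>Another Track</name>
    <date uts="1136070000">31 Dec 2005, 23:00</date>
  </track>
</recenttracks>"""


def parse_recent_tracks(xml_text: str) -> List[Dict]:
    """Turn the feed into a list of plain dicts, most recent track first."""
    root = ET.fromstring(xml_text)
    tracks = []
    for node in root.iter("track"):
        date = node.find("date")
        uts = date.get("uts") if date is not None else None
        tracks.append({
            "artist": (node.findtext("artist") or "").strip(),
            "track": (node.findtext("name") or "").strip(),
            "played_utc": int(uts) if uts else None,
        })
    return tracks


for track in parse_recent_tracks(SAMPLE):
    print(track)
```

Missing elements come back as empty strings or None rather than raising, which is about as much error handling as the original has.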

Download recentListening here. Feedback is always appreciated. Except negative feedback. Keep that to your bitter self!

Last Friday I went to see Chris Anderson discuss “The Long Tail of Time,” at the Long Now Foundation’s Friday “SALT” seminar. Chris presented a new twist on his fascinating research about The Long Tail (more detail here, here, and here). He’s got a book coming out that I’m very much looking forward to reading.

Without going into any detail (read Chris’ book instead…), the “long tail” refers to the long, flatter tail of a power-law distribution. The yellow part on the graph to the right shows the long tail of a power-law curve. The feature of a power law that is of most interest to Internet-based businesses is that the total volume of the long tail can be equal to or greater than the total volume of the ‘head’ part of the curve. Therefore, all things being equal, there is as much market potential in the long tail, the numerous ‘obscure’ products, as there is in the steep part of the curve.
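A quick back-of-the-envelope check of that claim, in Python. This is a toy model, not data from Chris’ research: it assumes demand for the product at popularity rank r is proportional to 1 / r^s (a Zipf-style power law), and the catalog sizes, the 100-item head and the exponent are all arbitrary choices.

```python
from typing import Tuple


def head_and_tail(n_items: int, head_size: int, exponent: float = 1.0) -> Tuple[float, float]:
    """Total demand in the top `head_size` ranks versus everything after them."""
    demand = [1.0 / (rank ** exponent) for rank in range(1, n_items + 1)]
    return sum(demand[:head_size]), sum(demand[head_size:])


# A shop with room for 1,000 titles versus a catalog of 100,000.
for catalog in (1_000, 100_000):
    head, tail = head_and_tail(catalog, head_size=100)
    print(f"{catalog:>7} items: head={head:.2f} tail={tail:.2f}")
```

With an exponent of 1, the tail is smaller than the head in the 1,000-item catalog and larger in the 100,000-item one, which is why the word “can” matters: the result depends on how long the tail is allowed to get and on how steep the curve is.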

Traditional businesses often overlook products with a small potential consumer base because focusing on niche products would mean less space on their shelves for more popular (and thus more profitable) products. Internet-based businesses such as Amazon and Netflix can avoid this problem because their ‘shelf space’ is effectively unlimited. It costs them nothing, and does not negatively affect the most popular products, to keep certain obscure works in their inventory. The recommendation features of these sites also help to push demand down the tail, making works more profitable that might have been otherwise ignored. Thus, the Internet and accompanying technologies have opened the door to niche products, be they independent recording artists, obscure authors, or funky products sold on eBay.

Chris’s talk gave a concise overview of the Long Tail phenomenon, and then moved on to the actual subject of the talk, the notion that there is a temporal element to the Long Tail. Essentially, Anderson was describing how many of the products inhabiting the long tail are old products, classics, cult favorites, etc. The premise is that there is a market for these works, and by making them available, businesses might tap into new or undiscovered markets. It is not just ‘classic’ products that might benefit from monetizing the long tail. Recent ‘big hits’, best-sellers, hit albums, blockbuster movies, etc. that lose the huge popularity they enjoy might enjoy a much longer, if much smaller, interest as the work moves from the best seller list to the long tail. As an example, think of a John le Carré novel (I’m presently reading Absolute Friends…). Years ago, his work was on the bestseller lists, available in airport bookstores and big chain stores everywhere. But over time, new authors appear, new bestsellers are made, and older novels begin to fade from the consumer consciousness. Yet there remain people like me who want to buy and read these old best sellers. Why shouldn’t we have access to them?

And who says that just because a book loses popularity it will continue down that path to obscurity? This ‘decay function’ in bestsellers is not in fact a one-way road to obscurity. In the case of le Carré, the success of the movie The Constant Gardener reinvigorated interest in his books, and his novels are enjoying a renaissance. Six months ago, I couldn’t find le Carré’s novels (with the exception of The Constant Gardener) at my local Cody’s or Borders. They told me that they were no longer carrying his books. Yet last week, I walked into Cody’s in SF and found three of his works. Le Carré essentially went out of print, then as the result of external factors, his works gained popularity again and worked their way back up the tail, and back onto bookshelves at independent stores. Who knows if this would have happened without the proven demand that the Long Tail illustrated?

Many believe the economics of The Long Tail portend the death of the centralized gatekeeper. Large media companies (record labels, news corporations, publishers, major booksellers, etc.) traditionally need to invest considerable resources in market analysis, determining what products have the greatest potential for success. They act as the gatekeepers of our culture. Most products never make it to market because somebody thinks they won’t be profitable. Chris calls this “pre-filtering” or something like that (he should know all about pre-filtering, since he is the editor at Wired). But it is possible we can move away from pre-filtering and let more products get onto the market, and rely on consumers to decide if the product is worthwhile. To be sure, media companies will still need to invest up-front for some projects like expensive blockbuster movies, but many other avenues for entertainment will open up if the costs of production and distribution are zero.

What would happen if we just opened the gates to content production, so that anyone, anywhere could produce an album, make a movie or write a novel and have it accessible to hundreds of millions of potential consumers? We might not need to answer this question theoretically, because it appears that our culture and our economy are headed in this direction.

A former professor of mine used to talk about “creating the technology and applications that will enable daily media consumers to become daily media producers”. I used to think this was fundamentally a bad idea. How much more amateur crap do we need out there? Yet part of building the technologies and applications that enable ordinary citizens to be producers of media is building the technologies to sift through the greater volume of content to find the good stuff.

With better search technologies, tagging, collaborative filtering and print on demand, we can not only find the information we are looking for more easily, but we can encounter works that were buried in history, bound to rot away in libraries. What if all those old rotting books, and all of our new, unfiltered content, were indexed and available for our perusal? The cover story of last Sunday’s New York Times Magazine touched on this (I recommend reading the full article…):

“If you can truly incorporate all texts — past and present, multilingual — on a particular subject, then you can have a clearer sense of what we as a civilization, a species, do know and don’t know. The white spaces of our collective ignorance are highlighted, while the golden peaks of our knowledge are drawn with completeness. This degree of authority is only rarely achieved in scholarship today, but it will become routine.”

In short, there is tremendous value in making all works ever created available to be searched, indexed and discovered. The Long Tail is home to such a rich and diverse set of cultures. We should embrace it. I believe now, more so than I did a year ago, that opening the doors to the playground, making all content available to consumers and potential producers, will surely result in much richer, more diverse and more creative content. To be sure, there are many hurdles to overcome, most of them legal, but ultimately, it appears this is what our future looks like. It is a bright day for the indie artist, researcher and student; and a bad day for independent, physically situated businesses, particularly booksellers and record shops, as the closure of Cody’s in Berkeley illustrates. Major content producers and distributors may lose their role as gatekeepers of our culture, but I think they are creative and savvy enough to find a new role in the Long Tail.

This is a quick moving area, and one that has tremendous implications for the content we create and consume, the business strategies that are built around this content, and the laws that govern it. On a closing note, I really enjoyed Chris’ presentation style. He used PowerPoint in a very refreshing way. There were no bullet points, no stupid slides with a sentence or two, or goofy MS Office graphics on them. Rather, each slide was a rich graph or chart, illustrating his talking points with well-visualized data. The talk was meandering at times, but it seemed that at each intersection, for each new topic, he had some great chart to highlight his point.

Apropos my previous entry on ZoomCloud, Paul Lamere posted a blog entry about a pretty cool little Perl tool called Album Art Cloud (written by Andrew Hitchcock). This tool leverages Musicmobs’ open WS API to build a composite image of album cover art based on play counts of various songs. Here is mine (dynamic, browsable version available here):

Brooke's Album Cloud 2006.03.20

Cover art from albums listened to more frequently is displayed at a larger size, similar to how tag/word clouds display words with more frequent occurrences in a larger font. In theory, this approach is a more interesting way to view the listening habits of a person or group of people. Instead of just looking at a bar chart of play count frequencies, one gets a visual map of album cover art, which one can mouse over to see play frequency, or click on to look at the album’s page on Musicmobs. What sites like Musicmobs, Last.FM, MusicStrands, etc. lack is good visualizations for their data. That’s why an Album Cloud is such a cool idea. It takes a rich source of data and builds a great visual browsing interface.
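The sizing idea is the same one tag clouds use, and a minimal version fits in a few lines of Python. I haven’t read how Andrew’s Perl tool actually scales its images; the linear mapping and the pixel bounds below are assumptions for illustration.

```python
from typing import Dict


def cover_sizes(play_counts: Dict[str, int], min_px: int = 40, max_px: int = 160) -> Dict[str, int]:
    """Map each album's play count to a cover size in pixels, linearly."""
    if not play_counts:
        return {}
    lowest = min(play_counts.values())
    highest = max(play_counts.values())
    span = highest - lowest
    sizes = {}
    for album, count in play_counts.items():
        # When every album has the same count, draw them all at full size.
        fraction = (count - lowest) / span if span else 1.0
        sizes[album] = round(min_px + fraction * (max_px - min_px))
    return sizes


print(cover_sizes({"Album A": 10, "Album B": 55, "Album C": 100}))
```

Tag clouds often use a logarithmic scale instead, so that one obsessively played album doesn’t shrink everything else to the minimum size.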

As Andrew notes, there are some problems with actually implementing the cloud. Evidently, Musicmobs only has cover art (pulled from Amazon’s API) for a small percentage of albums tracked. If the Art Cloud program can’t find the album art for a particular album, it won’t visualize the frequencies for that album. This means that the cloud doesn’t accurately reflect play count. Compare, for example, my album art cloud above with a bar chart of my play counts on Last.FM:

Notice how some of my big favs of late are not present in the album cloud. Clap Your Hands Say Yeah, which I evidently listened to obsessively for a while, isn’t even in the Album Cloud. Ditto with Handsome Boy Modeling School. I am sort of comparing apples and oranges, since the Album Cloud is aggregated along albums, while the last.fm chart is at the artist level, but the point remains.

The problem here is that the cloud misrepresents a user’s listening habits and, more problematically, tends to skew the data towards more popular music. Popular music is more likely to be found on the Amazon WS API (depending on how the query is constructed), so less popular music, or music out on the long tail, gets overlooked.

There should be some way around having so much missing album art. Perhaps Musicmobs could straighten it out by refining their query against Amazon’s API. When we worked on Orpheus, we also used the Amazon API for retrieving cover art. Anecdotally, it seems like we were better at finding the art. But then again, we also used MusicBrainz to clean up users’ tags before submitting queries to Amazon. I’m not sure if Musicmobs does that.

A lack of album art on Amazon could become increasingly problematic as the music landscape changes. ‘Mash-ups’ generally don’t show up on albums, and aren’t sold on Amazon. Many smaller bands don’t sell their albums on Amazon, so the cover art isn’t there. As I said, if tools like this miss the long tail, they will become increasingly inaccurate. Perhaps an interim solution would be to just default to text when album art isn’t available. At least then the tool wouldn’t misrepresent the users’ listening habits.

Another problem with the Album Art Cloud is that it relies exclusively on play count to derive someone’s ‘favorite’ albums. There is other metadata that can help determine the users’ favorite music, such as ratings, repeat plays and appearance on multiple playlists. Since this data is available in the iTunes XML file, it should be available to Musicmobs as well, and thus to Andrew’s kickass tool. It would be nice to perhaps provide an option where users who rate their music (which I think is a relatively small pool of users) could include their ratings as part of the weighting scheme that determines the size of cover art. Thus, a song played only once, but rated a 5, could receive as large a weight as a song (or album) played 5 times with no rating. There are problems with this approach as well: users have different ways of using the rating systems, and there is no consistent semantic interpretation for what the ratings actually mean. But it’s an idea. The other option would be to remove the semantic implication of ‘Favorite’ albums from the tool and instead call it what it is: an Album Cloud based on frequency. This is what clouds typically represent anyway.
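One simple scheme that produces exactly that equivalence is to multiply the play count by the star rating and treat unrated music as a multiplier of 1. The Python below is a sketch of that one idea, not anything Musicmobs or the Album Art Cloud does; the function name and the 1-to-5 scale are assumptions.

```python
from typing import Optional


def album_weight(play_count: int, rating: Optional[int] = None) -> int:
    """Weight used to size a cover: play count scaled by an optional 1-5 rating."""
    if play_count < 0:
        raise ValueError("play_count cannot be negative")
    if rating is None:
        # Unrated music counts at face value.
        return play_count
    if not 1 <= rating <= 5:
        raise ValueError("rating must be between 1 and 5")
    return play_count * rating


print(album_weight(1, rating=5))  # played once, rated 5
print(album_weight(5))            # played five times, never rated
```

It also shows the semantic problem: under this scheme a 1-star rating is indistinguishable from no rating at all, which is probably not what someone who bothers to give 1 star means.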

All in all, this is a cool little tool. Thanks Andrew!

On a related note, I just started using Musicmobs. It’s pretty cool. I especially like the playlist comparisons it provides, and the ‘similar listeners’ list. One can view other playlists or libraries that bear a resemblance to yours, see what songs you have, and which ones you don’t. In some circumstances you can even stream a section of a song that isn’t in your library.