Music Businesses Media Math Apple

Crunching the Math On iTunes 276

markmcb writes "OmniNerd has posted an interesting article about the statistical math behind iTunes. The author makes some interesting observations concerning the same song playing twice in a row during party shuffle play, the impact that star ratings have on playback, and comparisons with plain old random play (star ratings not considered)." From the article: "To test the option's preference for 5-stars, I created a short playlist of six songs: one from each different star rating and a song left un-rated. The songs were from the same genre and artist and were changed to be only one second in duration. After resetting the play count to zero, I hit play and left my desk for the weekend. To satisfy a little more curiosity, I ran the same songs once more on a different weekend without selecting the option to play higher rated songs more often. Monday morning the play counts were as shown in Table 1."
This discussion has been archived. No new comments can be posted.

  • by ReformedExCon ( 897248 ) <reformed.excon@gmail.com> on Sunday August 28, 2005 @05:52AM (#13420075)
    I'm looking at this data and it seems that iTunes does seem to pick out favorite songs more often than not-so-favorite songs. Which, I suppose, is the whole idea behind the Party Shuffle concept.

    So after analyzing all that data, how does Brian Hansen come to the conclusion that "it's simply the mind's tendency to find a pattern that makes you think iTunes has a preference". Uh, no. It's the software learning that you have a certain type of genre or style that you strongly favor and will selectively pick songs that are related, thus giving you a better-selected playlist.

    And it seems that the program has a bug in that it will play a song twice in a row. That's a real bug (if you don't like that type of thing).
  • Interesting (Score:5, Interesting)

    by hattig ( 47930 ) on Sunday August 28, 2005 @05:57AM (#13420089) Journal
    I wish iTunes would get ratings from some online source much like it gets tracknames from Gracenote. Can you imagine a server of user-submitted ratings? You could opt to use an average rating from all users, or a rating from users with particular tastes (i.e., if you are a metaller, then you'll probably not want ravers' musical opinions affecting your ratings!).

    Why? Because I haven't got the time to go around rating my entire music library. Judging from that article, it is dangerous to only do a few because of the weighting algorithm used - surely it would be more sensible to assume that 'not rated' meant 3 stars rather than 0 stars? That way you could rate down shitty songs, and rate up excellent songs, but ignore rating the vast majority of songs.
  • Re:Interesting (Score:4, Interesting)

    by the_unknown_soldier ( 675161 ) on Sunday August 28, 2005 @06:03AM (#13420101)
    I only rate good songs... Indifference is the worst rating you can give to a song, so I think "0" fits in pretty well

    As for Gracenote: perhaps sales on the ITMS could act as a gauge of this. e.g. "This is this artist's most downloaded song and this artist compared to similar ones is bought 5x as much, so our algorithms suggest it should be rated 5" Then once you have downloaded it you can change it if you get the time.
  • by hattig ( 47930 ) on Sunday August 28, 2005 @06:08AM (#13420112) Journal
    You can only play the same song twice in a row if the algorithm reshuffles the songlist after every song played.

    If you do a static shuffling, i.e., a shuffle at the beginning of playback, and then trudge through the playlist that was generated then you will certainly get each song played the same number of times, and you won't get repeats. The only chance of getting a repeated song is if the last song of a shuffled playlist is the same as the first song of the next shuffled list, which has probability 1/n (1/n^2 for any one particular song).

    You can combine the two however. Have 6 queues, one for *****, another for ****, and so on. Each queue would have its own last-played pointer. Each queue would be randomly shuffled once, until all songs in that queue have been played. Then have your weighting algorithm merely choose which queue to play from, and then play the next song in that queue.
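The per-rating queue idea above can be sketched roughly as follows. This is only an illustration under assumptions: the class name `RatedShuffler`, its method names, and whatever weights are passed in are hypothetical, and nothing here is claimed to be what iTunes actually does.

```python
import random


class RatedShuffler:
    """One shuffled queue per rating; a weighted draw picks which queue plays next."""

    def __init__(self, songs_by_rating, weights, rng=None):
        # songs_by_rating: {rating: [song, ...]}, weights: {rating: relative weight}
        self.rng = rng if rng is not None else random.Random()
        # Ratings with no songs are dropped so they can never be drawn.
        self.songs = {r: list(s) for r, s in songs_by_rating.items() if s}
        self.weights = {r: weights[r] for r in self.songs}
        self.queues = {r: [] for r in self.songs}

    def _refill(self, rating):
        # Shuffle the whole rating group once; it is then played to exhaustion.
        queue = list(self.songs[rating])
        self.rng.shuffle(queue)
        self.queues[rating] = queue

    def next_song(self):
        ratings = list(self.songs)
        rating = self.rng.choices(
            ratings, weights=[self.weights[r] for r in ratings], k=1
        )[0]
        if not self.queues[rating]:
            self._refill(rating)
        return self.queues[rating].pop()
```

Within one rating, every song is played once before any of them comes around again. A back-to-back repeat is still possible at the moment a queue is reshuffled, or when a rating holds only one song.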
  • Re:Reminds me of... (Score:4, Interesting)

    by hattig ( 47930 ) on Sunday August 28, 2005 @06:15AM (#13420124) Journal
    Maybe songs need more than one rating.

    Rating For Morning Listening (* for Aphex Twin, Slayer, etc)
    Rating For Afternoon Listening (**)
    Rating For Evening Listening (****)
    Rating For Party Listening (**)
    Rating For ${mood} Listening

    Then instead of getting work done we can spend our entire lives rating music.
  • Re: Try last.fm (Score:5, Interesting)

    by P!Alexander ( 448903 ) on Sunday August 28, 2005 @06:33AM (#13420170)
    That's exactly why I love last.fm [www.last.fm] (formerly Audioscrobbler & Last.fm). It automatically tracks what you listen to and then allows that information to be used to give you neighbors in the music world based on what interests you have in common. You can add friends, join groups, and even tag your music. All of this is extremely useful in finding new stuff. They've got plugins for all the major media players (and even some minor ones).

    Add on top of that the ability to play a custom-built radio station, set it to play only new music or listen only to music from a particular user profile.

    Linux and BSD supported! Open source plugins and radio station player! Could it get better? ;)

    ---
    but make sure that the last line
    Generated by SlashdotRndSig [snop.com] via GreaseMonkey [mozdev.org]
  • by Anonymous Coward on Sunday August 28, 2005 @06:58AM (#13420220)
    David
    sorry to hear that your business is stalling. Clearly I don't live in your neighborhood (or even your country), but my experience of downloading music has been different: I hear it on the radio (community radio), if I want to hear it again I download it, if I like it I go and buy it. If I don't like it enough I don't buy it - sort of like podcasting music. Almost every CD I have bought in the past 3 years has been bought this way (that's 1 or 2 a week). I'm buying more music now than I did before I started downloading music.
    Perhaps your 'family demographic' is the wrong business strategy for you these days as these 'family music' buyers are downloading but not buying.
  • by ciroknight ( 601098 ) on Sunday August 28, 2005 @07:07AM (#13420239)
    Your decimals look more like the pricing model than the weights for playing songs..

    5 star - .285 -- $299, iPod (full?) 20gb
    4 star - .238 -- $249, iPod mini 6gb
    3 star - .190 -- $199, iPod mini 4gb
    2 star - .143 -- $149, iPod shuffle 1gb
    1 star - .095 -- $99, iPod shuffle 512mb
  • by hattig ( 47930 ) on Sunday August 28, 2005 @07:17AM (#13420252) Journal
    Music has turned from something that you collect and treasure into something you have and listen to practically all the time. It is very rare you decide you want to listen to just song X these days (in comparison to how much music is listened to overall), and actively put it on and actively spend the time solely listening to it. Large mp3 collections have replaced radios at many places, great for getting rid of the music you really dislike and the DJ.

    Would I pay to have my music rated by an external algorithm? No. Would I pay to have my music peer rated? No - I'd also be contributing back to it like I contribute back to Gracenote and FreeDB.

    I suppose it is easiest to just rate everything *** and apply ****/***** and **/* to the tracks I really notice as standing out.
  • Re:Interesting (Score:5, Interesting)

    by Stuart Gibson ( 544632 ) on Sunday August 28, 2005 @07:21AM (#13420262) Homepage
    Select All -> Get Info -> My Rating -> Three Stars.

    Rate up and down others as necessary. OK, not the point that default should be doing this for you, but a quick fix if you want it to work that way.

    If you already have songs rated then create a 0 star smart playlist and repeat.

    Stuart
  • by wackybrit ( 321117 ) on Sunday August 28, 2005 @07:51AM (#13420315) Homepage Journal
    Most people follow a bell shaped curve for their ratings, with the 3-star rating being the most common.

    I mean, where is this statistic coming from?

    In my case the majority of rated songs are 5's, almost the same number of 4's, then some 3's, and hardly any 2's or 1's.. with perhaps 50% left unrated. I use iTunes at least several hours a day. Those of my friends who use iTunes seem to have a similar distribution.
  • Re:Reminds me of... (Score:4, Interesting)

    by Gorath99 ( 746654 ) on Sunday August 28, 2005 @07:56AM (#13420322)
    I was thinking about something like this myself. Basically, what I'd like to have are two flags:

    1: Never play unless I explicitly say so.
    2: Don't include in shuffle.

    The first one I'd use to flag interviews etc. that are sometimes included on albums. It's not necessarily bad content, just something that you don't generally need to hear multiple times.

    The second one is for flagging things like Beethoven's 9th. It's really good music, but you don't want 67 minute long pieces in a random playlist.

    I currently just use the 1 and 2 star ratings for this, but it's not really ideal. It's too bad (but understandable) that iTunes has no option for looking at TXX frames [id3.org] or I could implement it in a better way.
  • by Soul-Burn666 ( 574119 ) on Sunday August 28, 2005 @08:01AM (#13420332) Journal
    Moreover, following the "Birthday Paradox" [wikipedia.org], if you have N songs and the selection is completely random, then in a list of roughly 1.18*sqrt(N) songs there's a 50% chance a song will appear twice.
    For 4000 songs, that's around 75 songs. So if your player chooses tracks completely randomly, then about half of the times you listen to 75 songs, you'll hear the same song twice among those 75.

    Even if your player doesn't play the same song twice, if you have 8000 songs from 4000 artists, 2 songs per artist, then you get a similar calculation.
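The birthday-paradox estimate above is easy to check numerically. The sketch below assumes each play is an independent, uniform pick from the library (the "completely random" case, not the iTunes weighting); the function names are made up for illustration. Under that assumption the exact product puts the even-odds point a little above sqrt(N).

```python
def repeat_probability(n_songs, n_plays):
    """Chance that at least one song repeats in n_plays uniform random picks."""
    p_all_distinct = 1.0
    for k in range(n_plays):
        # The (k+1)-th pick must avoid the k songs already heard.
        p_all_distinct *= (n_songs - k) / n_songs
    return 1.0 - p_all_distinct


def plays_for_even_odds(n_songs):
    """Smallest number of plays at which a repeat is at least 50% likely."""
    plays = 1
    while repeat_probability(n_songs, plays) < 0.5:
        plays += 1
    return plays
```

Calling `plays_for_even_odds(4000)` gives the list length for the 4000-song library discussed here, and `repeat_probability(4000, 64)` gives the chance of a repeat in a list of sqrt(N) songs.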
  • Re:Reminds me of... (Score:1, Interesting)

    by Anonymous Coward on Sunday August 28, 2005 @09:08AM (#13420505)
    Those of you who know the album can appreciate that it's not the kind of music that you'd maybe choose as everyday listening material.

    I like it... but then most of the rest of my music is just as weird.
  • Modal Music (Score:5, Interesting)

    by Johnny Mozzarella ( 655181 ) on Sunday August 28, 2005 @09:35AM (#13420572)
    A friend of mine who worked at a radio station that played a very diverse range of music told me how they select music.

    She said that research had shown that listeners would rate the same song higher if it followed other songs of a similar genre. If they play songs of different genres randomly the listener does not enjoy the music as much.

    So their tendency is to play "blocks" of music.
    For example....
    4 Classic Rock songs
    3 Blues Songs
    3 Folk songs
    4 Female Rockers
    3 Grunge
    etc.

    This is common knowledge in the radio world. I wonder if Apple has incorporated this type of logic into its iTunes algorithms?

    The radio station in question is WXPN and can be found under iTunes > Radio > Public > WXPN
  • Re:Reminds me of... (Score:4, Interesting)

    by Lars T. ( 470328 ) <{Lars.Traeger} {at} {googlemail.com}> on Sunday August 28, 2005 @09:49AM (#13420615) Journal
    Use unique strings (as many as you like) in the Comments tag. Like "-don't play unless I say so-" or "-don't include in shuffle-". Then build intelligent playlists accordingly (Comments doesn't include "-don't play unless I say so-"). Errm, better use something shorter like "-DPlay-".
  • by mpiktas ( 740253 ) on Sunday August 28, 2005 @10:35AM (#13420760)

    After reading the article, I still do not understand the iPod's shuffling algorithm.

    The first half of the article is devoted to describing how the writer got the probabilities of rated songs and the properties of these probabilities. Although these probabilities give some insight into the shuffling algorithm, they are pretty useless, since they are observed from an unrealistic list of songs, i.e. 6 songs with different ratings.

    Then comes the formula in Figure 2. How it is calculated and where the author takes it from is not explained in the article. Also this formula is not backed up by empirical observation. The rest of the article is devoted to analyzing the effects of this formula, which are interesting, yet could have no importance if the actual formula is different. So this article does not really explore the iPod shuffling algorithm, but explores how the iPod shuffling algorithm would work if the probability of the next song were calculated according to the formula provided by the author. That is pretty useless, since we all can provide our own formulas and write articles.

    Now concerning this formula. To me it seems a little strange. Consider a hypothetical song list with 1000 unrated songs and one song with a 5 star rating. Then the probability (according to the formula provided by the author) that the song with the 5 star rating would come up is
    0.27/(1000*0.039+0.27) = 0.006875477
    which is pretty miserable odds. If I rated it so highly, that means I want to hear it a lot; with such a shuffling algorithm, I would hear it slightly more, yet not a lot. Of course, then I could create a playlist with this song only, but then why does one need a rating system if it does not perform?

    So it would be really interesting to know iPod's shuffling algorithm, to see if it saves the hassle of creating your own playlists. (Or even the possibility to provide your own algorithm), yet this article does not provide any insights.
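The arithmetic in the comment above generalizes to any mix of ratings. A minimal sketch, assuming the per-song weight is simply the observed play frequency for its rating (the 0-star and 5-star values appear in the calculation above; the 1 to 4 star values are the observed figures quoted elsewhere in this thread). The function name and the dictionary layout are illustrative, and this is not confirmed to be the formula iTunes uses.

```python
# Observed per-rating play frequencies quoted in this thread (six-song test).
OBSERVED_WEIGHT = {0: 0.039, 1: 0.118, 2: 0.154, 3: 0.189, 4: 0.230, 5: 0.270}


def song_probability(rating, library_counts, weights=OBSERVED_WEIGHT):
    """Probability that one particular song with `rating` is picked next.

    library_counts maps rating -> number of songs with that rating.
    """
    total = sum(weights[r] * count for r, count in library_counts.items())
    return weights[rating] / total


# 1000 unrated songs plus a single 5-star song, as in the comment above.
p_five_star = song_probability(5, {0: 1000, 5: 1})
```

For comparison, a plain uniform pick over the same 1001 songs would give that song a 1/1001 chance, so under this assumed weighting the 5-star song is favored by a factor of about 0.27/0.039 over any single unrated song.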

  • Re:Reminds me of... (Score:5, Interesting)

    by Dixie_Flatline ( 5077 ) <vincent.jan.gohNO@SPAMgmail.com> on Sunday August 28, 2005 @11:51AM (#13421016) Homepage
    I use the stars to indicate how often things should be played.

    * - Never play. It's only in the list for the sake of completeness (I hate having partial albums)

    ** - Play very rarely. If I'm in the mood, I might listen to it.

    *** - I'll listen to it at least once a week. If it comes up randomly on the shuffle, I won't take it out of the list.

    **** - I can listen to this several times in a day.

    ***** - I'll listen to this song anytime, anywhere. If it comes up twice in a row, no problem. If my playlist only has this song on it, I can cope with that for at least a few hours.

    This means that I have to periodically re-rate the songs. That seems only reasonable, though. Why would songs stay at the same rating forever? As the novelty wears off, I can relegate a song to 4 or 3 stars.

    I also keep extensive smart playlists that make sure that songs that are 3 stars or less only get played once every few days.
  • by Bert690 ( 540293 ) on Sunday August 28, 2005 @01:37PM (#13421427)
    OK, after a bit more thinking, you were indeed very close. It appears the actual formula is:

    points(0 stars)=1
    points(1 stars)=3
    points(2 stars)=4
    points(3 stars)=5
    points(4 stars)=6
    points(5 stars)=7

    probability(X stars) = points(X stars) / 26

    This yields the following probabilities, listed alongside the observed values from the article along with 95% confidence intervals.

    p(5 star)=.2692 [.270 +- .0038]
    p(4 star)=.2308 [.230 +- .0036]
    p(3 star)=.1923 [.189 +- .0033]
    p(2 star)=.1538 [.154 +- .0031]
    p(1 star)=.1154 [.118 +- .0027]
    p(0 star)=.0385 [.039 +- .0016]

    As you can see each computed probability falls within the 95% confidence interval, so there's a good chance this is the correct formula.

    Boy do I have too much time on my hands today.
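The points table above can be checked in a few lines. This sketch takes the point values and the observed figures directly from the comment; the function names are illustrative, and the points scheme itself remains the commenter's guess rather than a documented iTunes formula.

```python
POINTS = {0: 1, 1: 3, 2: 4, 3: 5, 4: 6, 5: 7}

# stars -> (observed frequency, 95% half-width), as listed in the comment.
OBSERVED = {
    5: (0.270, 0.0038),
    4: (0.230, 0.0036),
    3: (0.189, 0.0033),
    2: (0.154, 0.0031),
    1: (0.118, 0.0027),
    0: (0.039, 0.0016),
}


def predicted(points=POINTS):
    """Probability per rating for a playlist with one song of each rating."""
    total = sum(points.values())
    return {stars: p / total for stars, p in points.items()}


def gaps(points=POINTS, observed=OBSERVED):
    """Absolute difference between predicted and observed, per rating."""
    model = predicted(points)
    return {stars: abs(model[stars] - obs) for stars, (obs, _) in observed.items()}
```

Comparing each gap with its half-width shows how close the fit is. The 3-star prediction (5/26) sits essentially on the edge of its interval, and the observed values are themselves rounded to three decimals, so the match is suggestive rather than conclusive.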

  • Re:Ok... (Score:1, Interesting)

    by Mozk ( 844858 ) on Sunday August 28, 2005 @03:16PM (#13421852)
    Things are only unpredictable in quantum physics because we can't predict things on that small of a level (yet). Unless you have studied quantum physics and seen experiments in action yourself, then you can't say it's true. That sounds close-minded but that's what I believe. Until I study quantum physics myself, I'm going to believe in determinism.
  • Dear Apple (Score:3, Interesting)

    by NeMon'ess ( 160583 ) <{flinxmid} {at} {yahoo.com}> on Sunday August 28, 2005 @04:06PM (#13422126) Homepage Journal
    Let me, an advanced user who knows a damn thing about computers and interfaces, change the weighting of the stars. I don't want my 5-star songs to play just twice as often as 2-star songs. I want them to play 6 times as often. I want 4-stars to play 4 times, 3-stars to play 3 times, and no-stars to play 2 times.

    Why no-stars? Because that way the majority of the collection is unrated. Starred songs really stand out in a playlist. 1 and 2 star songs play less often than no-stars, while 3, 4, and 5 play more often. But I want my favorites to play much more often than your arbitrary algorithm.
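A user-adjustable weighting like the one requested above is straightforward to express as a weighted random pick. This is a hypothetical sketch, not an iTunes feature: the 6/4/3/2 weights come from the comment, the 2-star baseline of 1 is implied by "6 times as often", and the 1-star value of 0.5 is purely an assumption, since the comment only says it should be below unrated.

```python
import random

# Relative play weights per rating (0 = unrated). The 1-star value is assumed.
CUSTOM_WEIGHTS = {0: 2.0, 1: 0.5, 2: 1.0, 3: 3.0, 4: 4.0, 5: 6.0}


def pick_next(library, weights=CUSTOM_WEIGHTS, rng=random):
    """Pick one (title, rating) pair, weighting each song by its rating."""
    song_weights = [weights[rating] for _, rating in library]
    return rng.choices(library, weights=song_weights, k=1)[0]
```

Because each pick is independent, this gives the requested long-run ratios but does nothing to prevent the same song coming up twice in a row.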
  • by Quevar ( 882612 ) on Sunday August 28, 2005 @04:53PM (#13422362)
    I dislike hearing a song too often and I also like to hear my favorite songs more often, but also not repetitively. So, I have set up 6 playlists that gives me what I think is a good mix of my songs. They are as follows:
    1: rated 2 and 25 rated by least recently
    2: rated 3 and 73 rated by least recently
    3: rated 4 and 70 rated by least recently
    4: rated 5 and 32 rated by least recently
    playlist is any of playlists 1, 2, 3, or 4 with live updating

    The numbers (25, 73, 70, and 32) come from multiplying the number of songs in each category by the rating-1, so it is essentially the same as the "play higher rated songs" in PartyShuffle. I leave 1 rated songs for ones that I don't listen to very often. This way, I get a random selection of my music that does not repeat a song until I have more-or-less gone through the rest of them in that rating. And, it generally plays the 5 rated songs about 4 times more than the 1 rated songs.

    I found that I do not like the random feature since it often will play one song significantly more than another song. Eventually, it would even out, but in the range of 20 times playing a song, there can be a large discrepancy and I haven't heard some songs in longer than I'd like.
  • Re:Reminds me of... (Score:2, Interesting)

    by Salvo ( 8037 ) on Sunday August 28, 2005 @04:58PM (#13422385)
    I have three playlists on my iPod, which I use to sort and rate Music.
    'Unrated [infinity]' restricts the selection to Songs with a Rating of -----.
    'Unrated 50' also restricts it to the 50 least recently played songs.
    Finally, '***+' selects songs with a rating of ***--, ****-, or *****, and then only the 100 least recently played songs.
    All these playlists are Live Updating.

    When I need to rate songs, I play 'Unrated [infinity]' or 'Unrated 50', and rate songs as I play, using the same rules as the Parent. When I just want to listen to good music, I listen to '***+'.

    Also remember that unlike Slashdot, iTunes and the iPod can display Unicode characters, so you can use Real stars and Infinity instead of Asterisks and whatever.
  • by AlpineR ( 32307 ) <wagnerr@umich.edu> on Sunday August 28, 2005 @07:02PM (#13423019) Homepage
    Nice analysis of iTunes. I'm somewhat surprised at the small difference in play frequency between 3, 4, and 5 stars; and disappointed that unrated songs are almost never played. In my collection, unrated means that the music is new to my collection. I think 1 star should be the kiss of death, not a blessing upon a previously unrated song.

    But all this talk of 0, 1, 2, 3, 4, 5 has me thinking of another rating system. Would anybody care to do an analysis of the ratings in Slashdot comments? What are the relative populations (I expect a ton of 2's but how about the rest)? Do comments made in the first hour after a story is posted stand a better chance of reaching +5 than comments made later in the day?

    One of my gripes about the Slashdot comment system is that it discourages contemplation and discussion. Comments made more than 24 hours after a story is posted are rarely read and almost never moderated. This is in contrast with comment systems like Usenet or other bulletin boards, where threads can remain lively for weeks.

    AlpineR

  • by R3d M3rcury ( 871886 ) on Monday August 29, 2005 @12:40AM (#13424675) Journal
    Actually, with my TT, I've noticed a similar situation.

    First, certain burned CDs (I have yet to see this on a commercial CD) when played on random will only play the first track. However, if it is not played on random, it will play all the tracks. Interestingly enough, though, if I spin the CD in the holder, that will sometimes allow it to work correctly.

    The other interesting thing is that the CD player will not repeat tracks until all tracks of the CD have been played (Duh), at which point all bets are off. While I have never seen it play the same track back-to-back, I have seen it play a track, another track, and then the first track again. Note that this only happens once the CD has randomly played all tracks.

    I told the Audi dealer about this. They pretty much said, "Yeah, so?" I sort of agree with them. Of course, as soon as I have the cash, I'm getting an ice>Link [densionusa.com] and an iPod and I'll toss the CD player.
