Slashdot
Crunching the Math On iTunes
Posted by Zonk on Sun Aug 28, 2005 04:45 AM
from the not-completely-random dept.
markmcb writes "OmniNerd has posted an interesting article about the statistical math behind iTunes. The author makes some interesting observations concerning the same song playing twice in a row during party shuffle play, the impact that star ratings have on playback, and comparisons with plain old random play (star ratings not considered)." From the article: "To test the option's preference for 5-stars, I created a short playlist of six songs: one from each different star rating and a song left un-rated. The songs were from the same genre and artist and were changed to be only one second in duration. After resetting the play count to zero, I hit play and left my desk for the weekend. To satisfy a little more curiosity, I ran the same songs once more on a different weekend without selecting the option to play higher rated songs more often. Monday morning the play counts were as shown in Table 1."
This discussion has been archived.
No new comments can be posted.
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
iTunes is a monopoly (Score:3, Funny)
I bought the store about 12 years ago. It was one of those boutique record stores that sell obscure, independent releases that no-one listens to, not even the people that buy them. I decided that to grow the business I'd need to aim for a different demographic, the family market. My store specialised in family music - stuff that the whole family could listen to. I don't sell sick stuff like Marilyn Manson or cop-killer rap, and I'm proud to have one of the most extensive Christian rock sections that I know of.
The business strategy worked. People flocked to my store, knowing that they (and their children) could safely purchase records without profanity or violent lyrics. Over the years I expanded the business and took on more clean-cut and friendly employees. It took hard work and long hours but I had achieved my dream - owning a profitable business that I had built with my own hands, from the ground up. But now, this dream is turning into a nightmare.
Every day, fewer and fewer customers enter my store to buy fewer and fewer CDs. Why is no one buying CDs? Are people not interested in music? Do people prefer to watch TV, see films, read books? I don't know. But there is one, inescapable truth - Internet piracy is mostly to blame. The statistics speak for themselves - one in three discs world wide is a pirate. On The Internet, you can find and download hundreds of dollars worth of music in just minutes. It has the potential to destroy the music industry, from artists, to record companies to stores like my own. Before you point to the supposed "economic downturn", I'll note that the book store just across from my store is doing great business. Unlike CDs, it's harder to copy books over The Internet.
A week ago, an unpleasant experience with pirates gave me an idea. In my store, I overheard a teenage patron talking to his friend.
"Dude, I'm going to put this CD on the Internet right away."
"Yeah, dude, that's really lete [sic], you'll get lots of respect."
I was fuming. So they were out to destroy the record industry from right under my nose? Fat chance. When they came to the counter to make their purchase, I grabbed the little shit by his shirt. "So...you're going to copy this to your friends over The Internet, punk?" I asked him in my best Clint Eastwood/Dirty Harry voice.
"Uh y-yeh." He mumbled, shocked.
"That's it. What's your name? You're blacklisted. Now take yourself and your little bitch friend out of my store - and don't come back." I barked. Cravenly, they complied and scampered off.
So that's my idea - a national blacklist of pirates. If somebody cannot obey the basic rules of society, then they should be excluded from society. If pirates want to steal from the music industry, then the music industry should exclude them. It's that simple. One strike, and you're out - no reputable record store will allow you to buy another CD. If the pirates can't buy the CDs to begin with, then they won't be able to copy them over The Internet, will they? It's no different to doctors blacklisting drug dealers from buying prescription medicine.
I have just written a letter to the RIAA outlining my proposal. Suing pirates one by one isn't going far enough. Not to mention pirates use the fact that they're being sued to unfairly portray themselves as victims. A national register of pirates would make the problem far easier to deal with. People would be encouraged to give the names of suspected pirates to a hotline, similar to TIPS. Once we know the size of the problem, the police and other law enforcement agencies will be forced to take piracy seriously. They have fought the War on Drugs with skill, so why not the War on Piracy?
This evening, m
Re:iTunes is a monopoly (Score:2, Informative)
Re:iTunes is a monopoly (Score:2, Insightful)
... how? All I see is a tired old troll :/ Where's the funny / insightful that "satire" implies?
Re:iTunes is a monopoly (Score:3, Insightful)
"Yeah, dude, that's really lete [sic], you'll get lots of respect."
I was fuming. So they were out to destroy the record industry from right under my nose? Fat chance. When they came to the counter to make their purchase, I grabbed the little shit by his shirt. "So...you're going to copy this to your friends over The Internet, punk?" I asked him in my best Clint Eastwood/Dirty Harry voice.
You don't find that hilarious? Something is wrong wi
Re:iTunes is a monopoly (Score:3, Insightful)
They have fought the War on Drugs with skill
clearly shouldn't be allowed online without a minder.
that sucks (Score:5, Funny)
Re:that sucks (Score:3)
he said _I_ can't tell you.. not _tell me_.
but good job on the two character troll.
Re:iTunes is a monopoly (Score:5, Insightful)
Piracy happens because technology happens. We pirate music because it's easy to copy and considerably cheaper than buying it. We don't pirate books because it's frankly too expensive in photocopying charges, but there's a whole collection of pirated PDFs out there, if you care to look.
Technology changes the world we live in. I don't recall the Horse & Cart Association of America (HCAA) suing people that moved to cars which put them out of business. I also don't recall the MPAA or RIAA suing Intel, IBM or Microsoft for giving us these tools that enable us to pirate music.
If piracy destroys the music business, so be it. Technology often destroys antiquated business models whether it's children cleaning chimneys, horse drawn carriages, coal mining or farming by hand. These people need to find a business model that works. An artist only makes around 5% from every track sold, the label and distributors cream off the rest. That's unfair, IMO.
Why do we also need to have movie distributors for every corner of the world bidding for the distribution rights? Are we not one global market?
I think it's about time that the movie and music industries were overhauled as they've had way too much power and too much of a monopoly for too long. After all, we're not killing people here with this technology, we're just changing lives. We're just hurting the profit margins, I thought this is what happened in a capitalist and democratic society. Why do we in the Western world create these societies with freedom to innovate and freedom to make money but then try to shackle them when it starts to backfire?
Bring on the technology, let's keep changing the world!!!
Re:iTunes is a monopoly (Score:3, Insightful)
Ok... (Score:4, Funny)
Next, "iTunes really does play tunes!"
Re:Ok... (Score:5, Insightful)
Re:Ok... (Score:5, Funny)
And, of course, looking into the origins of said intelligence is blasphemy.
Re:Ok... (Score:3)
I am not sure I see what he sees (Score:3, Interesting)
So after analyzing all that data, how does Brian Hansen come to the conclusion that "it's simply the mind's tendency to find a pattern that makes you think iTunes has a preference". Uh, no. It's the software learning that you have a certain type of genre or style that you strongly favor and will selectively pick songs that are related, thus giving you a better-selected playlist.
And it seems that the program has a bug in that it will play a song twice in a row. That's a real bug (if you don't like that type of thing).
Re:I am not sure I see what he sees (Score:5, Informative)
Re:I am not sure I see what he sees (Score:2, Interesting)
If you do a static shuffling, i.e., a shuffle at the beginning of playback, and then trudge through the playlist that was generated then you will certainly get each song played the same number of times, and you won't get repeats. The only chance of getting a repeated song is if the last song of a shuffled playlist is the same as the first song of the next shuffled list, which is 1/n^2.
You can comb
Re:I am not sure I see what he sees (Score:2)
Obviously 1 song in the group is a bit of an exaggeration, it gets the point across. That idea wouldn't work :)
Re:I am not sure I see what he sees (Score:2)
No, it would be 1/n^2 if that song had to be a particular song. The probability of any song repeating is 1/n.
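The 1/n figure can be checked by brute force for small playlists. This is only a sketch: it assumes each reshuffle is an independent, uniformly random ordering of the same n songs, and the function name is made up for illustration.

```python
from fractions import Fraction
from itertools import permutations


def boundary_repeat_probability(n):
    """Exact chance that the last song of one shuffle equals the first
    song of the next, assuming both shuffles are uniform and independent."""
    orders = list(permutations(range(n)))
    # Count every (first shuffle, second shuffle) pair whose boundary repeats.
    hits = sum(1 for a in orders for b in orders if a[-1] == b[0])
    return Fraction(hits, len(orders) ** 2)


if __name__ == "__main__":
    for n in range(2, 6):
        print(n, boundary_repeat_probability(n))
```

Enumeration only works for tiny n (the number of pairs grows as (n!)^2), but it is enough to see the pattern.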
Re:I am not sure I see what he sees (Score:2)
I believe TFA said that he noticed that the same song was on the list twice, so obviously he could have gotten the same song twice in a row. Kind of a no duh moment there, but whatever.
Re:I am not sure I see what he sees (Score:2)
Reminds me of... (Score:5, Funny)
It became so annoying that I ended up removing the album from iTunes, at which point my iPod promptly died. The replacement was big on Roxy Music IIRC...
Re:Reminds me of... (Score:4, Interesting)
Rating For Morning Listening (* for Aphex Twin, Slayer, etc)
Rating For Afternoon Listening (**)
Rating For Evening Listening (****)
Rating For Party Listening (**)
Rating For ${mood} Listening
Then instead of getting work done we can spend our entire lives rating music.
Re:Reminds me of... (Score:4, Interesting)
1: Never play unless I explicitly say so.
2: Don't include in shuffle.
The first one I'd use to flag interviews etc. that are sometimes included on albums. Is not necessarily bad content, just something that you don't generally need to hear multiple times.
The second one is for flagging things like Beethoven's 9th. It's really good music, but you don't want 67 minute long pieces in a random playlist.
I currently just use the 1 and 2 star ratings for this, but it's not really ideal. It's too bad (but understandable) that iTunes has no option for looking at TXX frames [id3.org] or I could implement it in a better way.
Re:Reminds me of... (Score:4, Interesting)
Re:Reminds me of... (Score:5, Interesting)
* - Never play. It's only in the list for the sake of completeness (I hate having partial albums)
** - Play very rarely. If I'm in the mood, I might listen to it.
*** - I'll listen to it at least once a week. If it comes up randomly on the shuffle, I won't take it out of the list.
**** - I can listen to this several times in a day.
***** - I'll listen to this song anytime, anywhere. If it comes up twice in a row, no problem. If my playlist only has this song on it, I can cope with that for at least a few hours.
This means that I have to periodically re-rate the songs. That seems only reasonable, though. Why would songs stay at the same rating forever? As the novelty wears off, I can relegate a song to 4 or 3 stars.
I also keep extensive smart playlists that make sure that songs that are 3 stars or less only get played once every few days.
Interesting (Score:5, Interesting)
Why? Because I haven't got the time to go around rating my entire music library. Judging from that article, it is dangerous to only do a few because of the weighting algorithm used - surely it would be more sensible to assume that 'not rated' meant 3 stars rather than 0 stars? That way you could rate down shitty songs, and rate up excellent songs, but ignore rating the vast majority of songs.
Re:Interesting (Score:4, Interesting)
As for Gracenote: perhaps sales on the ITMS could act as a gauge of this. e.g. "This is this artist's most downloaded song and this artist compared to similar ones is bought 5x as much, so our algorithms suggest it should be rated 5" Then once you have downloaded it you can change it if you get the time.
Re: Try last.fm (Score:5, Interesting)
Add on top of that the ability to play a custom-built radio station, set it to play only new music or listen only to music from a particular user profile.
Linux and BSD supported! Open source plugins and radio station player! Could it get better?
---
but make sure that the last line
Generated by SlashdotRndSig [snop.com] via GreaseMonkey [mozdev.org]
Re:Interesting (Score:3, Informative)
I can't really say how well this works in practice, or which programs support it, because I don't use the feature myself. However, I suspect it would work better than an explicit rating system, much like bayesian spam filters work better than explicit
Re:Interesting (Score:5, Interesting)
Rate up and down others as necessary. OK, not the point that default should be doing this for you, but a quick fix if you want it to work that way.
If you already have songs rated then create a 0 star smart playlist and repeat.
Stuart
Re:Interesting (Score:5, Funny)
Re:You wonder why the music industry is mad (Score:3, Interesting)
Would I pay to have my music rated by an external algorithm?
Finally (Score:5, Funny)
Underlying formula (Score:5, Informative)
From their results, I'd venture a guess as to the underlying algorithm:
Each song is given a number of points equal to (rating + 1). Then the probability of the song being played is (song points)/(total points).
Or, to put it more succinctly:
prob(song) = (rating + 1)/(n + sum(i=1..n)(rating(i)))
That yields probabilities in the given test case of:
5 star - .285
4 star - .238
3 star - .190
2 star - .143
1 star - .095
0 star - .048
Which is reasonably close to what the author found. Heck, if I were implementing that feature, it's what I'd try first...
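For anyone who wants to reproduce those numbers, here is a minimal sketch of the guessed (rating + 1) weighting. It is only the guess from this comment, not anything known about Apple's code, and the function name is hypothetical.

```python
from fractions import Fraction


def play_probabilities(ratings):
    """Chance of each song being picked next under the guessed weighting:
    every song gets (rating + 1) points, and picks are proportional to points."""
    points = [r + 1 for r in ratings]
    total = sum(points)  # equals n + sum of all ratings
    return [Fraction(p, total) for p in points]


if __name__ == "__main__":
    ratings = [5, 4, 3, 2, 1, 0]  # the six-song test playlist from the article
    for r, p in zip(ratings, play_probabilities(ratings)):
        print(f"{r} star - {p} = {float(p):.4f}")
```

With one song per rating the total is 21 points, so the 5-star song gets 6/21 and the unrated song gets 1/21.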
Re:Underlying formula (Score:5, Interesting)
5 star -
4 star -
3 star -
2 star -
1 star -
the REAL underlying formula (Score:5, Interesting)
points(0 stars)=1
points(1 stars)=3
points(2 stars)=4
points(3 stars)=5
points(4 stars)=6
points(5 stars)=7
probability(X stars) = points(X stars) / 26
This yields the following probabilities, listed alongside the observed values from the article with their 95% confidence intervals.
p(5 star)=.2692 [.270 +- .0038]
p(4 star)=.2308 [.230 +- .0036]
p(3 star)=.1923 [.189 +- .0033]
p(2 star)=.1538 [.154 +- .0031]
p(1 star)=.1154 [.118 +- .0027]
p(0 star)=.0385 [.039 +- .0016]
As you can see, each computed probability falls within the 95% confidence interval, so there's a good chance this is the correct formula.
Boy do I have too much time on my hands today.
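The points table above is easy to turn into a small simulation. This is a sketch under the assumption that playback is plain weighted random selection with those points; the helper names are invented here, and only random.choices from the Python standard library does the actual sampling.

```python
import random
from collections import Counter

# stars -> points, copied from the table in the comment above
POINTS = {0: 1, 1: 3, 2: 4, 3: 5, 4: 6, 5: 7}


def expected_probabilities(points=POINTS):
    """Theoretical pick probability per rating, one song per rating."""
    total = sum(points.values())  # 26 for the six-song test playlist
    return {stars: p / total for stars, p in points.items()}


def simulate(plays, seed=0, points=POINTS):
    """Play `plays` songs by weighted random choice; return observed frequencies."""
    rng = random.Random(seed)
    stars = list(points)
    weights = [points[s] for s in stars]
    counts = Counter(rng.choices(stars, weights=weights, k=plays))
    return {s: counts[s] / plays for s in stars}


if __name__ == "__main__":
    expected = expected_probabilities()
    observed = simulate(200_000)
    for s in sorted(expected, reverse=True):
        print(f"p({s} star) expected {expected[s]:.4f} observed {observed[s]:.4f}")
```

Comparing the two columns over a long run is essentially the weekend experiment from the article, minus the weekend.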
Why Assume a Bell Curve? (Score:5, Insightful)
You'd think, with iTunes, that people would be buying music they like (a four or five rating) in a much higher proportion than music they'd rate as a three.
Then there's music added from your own collection. Maybe it's just me, but my preferences tend to be stronger than -, 1, 2, 3, 4, 5.
I usually go through my music collection on a regular basis and delete crap that I don't listen to, which is usually anything less than a three, and definitely a - or a one.
And is 4334 just a random arbitrary # of songs to use?
(when you add up X0 through X5)
I found it an odd statement too (Score:3, Interesting)
I mean, where is this statistic coming from?
In my case the majority of rated songs are 5's, almost the same number of 4's, then some 3's, and hardly any 2's or 1's.. with perhaps 50% left unrated. I use iTunes at least several hours a day. Those of my friends who use iTunes seem to have a similar distribution.
Re:Why Assume a Bell Curve? (Score:3, Informative)
When you look at data, particularly from modest numbers of samples, it seldom fits the bell curve nicely. Often it is multimodal.
However, the Gaussian (bell curve) distribution has one important property: it is the distribution that (given a number of stipulations like having a finite integral over any interval, and there being no set upper or lower limit to the variable) has the highest entropy. It is, in a sense, the most random of ran
Some calculations errors in my opinion.. (Score:5, Informative)
Calculating the odds that a 2% chance will come up in the next 50 songs as 50 * (2/100) = 100%, as the author does, doesn't work, and neither does 25 * (2/100) = 50%.
The correct calculations are: 1-(98/100)^50 = 63% and 1-(98/100)^25 = 39%.
This way you calculate the odds a song will be played at least once in the next 50 or 25 songs.
If you want to calculate the odds the song will be played exactly once in the next 50 or 25 songs:
50 * (2/100) * ((98/100)^49) = 37% or 25 * (2/100) * ((98/100)^24) = 31%.
I guess that's all..
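The same arithmetic in runnable form, as a sketch: it assumes every pick is independent with a fixed 2% chance, which is the model used in the calculations above. The function names are just for illustration.

```python
from math import comb


def at_least_once(p, n):
    """Chance an event with per-pick probability p happens at least once in n picks."""
    return 1 - (1 - p) ** n


def exactly_k(p, n, k):
    """Binomial chance the event happens exactly k times in n picks."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


if __name__ == "__main__":
    p = 2 / 100
    for n in (50, 25):
        print(f"n={n}: at least once {at_least_once(p, n):.4f}, "
              f"exactly once {exactly_k(p, n, 1):.4f}")
```

Multiplying n by p gives the expected number of plays, not a probability, which is why the naive version can reach or exceed 100%.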
From the article...trick of the mind (Score:4, Informative)
"Many claim to still see patterns as iTunes rambles through their music collection, but the majority of these patterns are simply multiple songs from the same artist. Think of it this way: If you have 2000 songs and 40 of them are from the same artist, there is always a 2% chance of hearing them next with random play. So right after one of their songs finishes, odds almost guarantee they will be played again within the next 50 songs and show a 50% chance they will play again within the next 25 songs. It's simply the mind's tendency to find a pattern that makes you think iTunes has a preference."
Re:From the article...trick of the mind (Score:3, Interesting)
For 4000 songs, that's around 64 songs. So if your player chooses tracks completely randomly, then 50% of the time you listen to 64 songs, you'll hear the same song twice among those 64.
Even if your player doesn't play the same song twice, if you have 8000 songs from 4000 artists, 2 songs per artist, then you get a similar calcula
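The repeat odds for a fully random player can be computed directly. This sketch assumes every play is an independent uniform pick from the whole library (the start of the comment above is cut off, so that model is an assumption), and the function names are hypothetical.

```python
def repeat_probability(library_size, plays):
    """Chance that at least one song is heard twice in `plays` picks,
    each pick uniform over the library and independent of the others."""
    p_all_distinct = 1.0
    for i in range(plays):
        # The (i+1)-th pick must avoid the i songs already heard.
        p_all_distinct *= (library_size - i) / library_size
    return 1 - p_all_distinct


def plays_for_even_odds(library_size):
    """Smallest number of plays at which a repeat is at least 50% likely."""
    plays = 0
    while repeat_probability(library_size, plays) < 0.5:
        plays += 1
    return plays


if __name__ == "__main__":
    print(f"{repeat_probability(4000, 64):.3f}")
    print(plays_for_even_odds(4000))
```

By my arithmetic, 64 plays from 4000 songs comes out at roughly a 40% chance of a repeat under this model, and even odds arrive somewhere in the mid-70s, so the square-root rule of thumb is in the right neighbourhood but a little low.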
Modal Music (Score:5, Interesting)
She said that research had shown that listeners would rate the same song higher if it followed other songs of a similar genre. If they play songs of different genres randomly, the listener does not enjoy the music as much.
So their tendency is to play "blocks" of music.
For example....
4 Classic Rock songs
3 Blues Songs
3 Folk songs
4 Female Rockers
3 Grunge
etc.
This is common knowledge in the radio world. I wonder if Apple has incorporated this type of logic into its iTunes algorithms?
The radio station in question is WXPN and can be found under iTunes > Radio > Public > WXPN
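A rough idea of what "block" programming could look like in a player, purely as a sketch: nothing here reflects how WXPN or iTunes actually schedules music, and the function name, block size, and sample songs are all made up for illustration.

```python
import random
from collections import defaultdict


def block_shuffle(songs, block_size=3, seed=None):
    """songs: iterable of (title, genre) pairs.
    Returns titles ordered in runs of up to `block_size` songs per genre."""
    rng = random.Random(seed)
    by_genre = defaultdict(list)
    for title, genre in songs:
        by_genre[genre].append(title)
    for titles in by_genre.values():
        rng.shuffle(titles)  # random order inside each genre
    playlist = []
    while by_genre:
        genre = rng.choice(list(by_genre))  # pick the next block's genre
        block = by_genre[genre][:block_size]
        by_genre[genre] = by_genre[genre][block_size:]
        playlist.extend(block)
        if not by_genre[genre]:
            del by_genre[genre]  # genre exhausted
    return playlist


if __name__ == "__main__":
    library = [(f"{g} {i}", g) for g in ("Blues", "Folk", "Grunge") for i in range(1, 6)]
    print(block_shuffle(library, block_size=3, seed=1))
```

This simple version can pick the same genre for two blocks in a row; a real scheduler would presumably add rules against that.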
Re:Modal Music (Score:3, Informative)
4 Classic Rock songs
3 Blues Songs
3 Folk songs
4 Female Rockers
3 Grunge
Not entirely true, and it depends on the station (as you stated). Some stations make it a point *not* to put songs with female lead singers together; *not* to put songs from the same R&B/Dance/whatever genre together; *not* to put songs from solo artists next to each other. And so on. And don't forget issues with playing more than one old song after another.
A
something i've noticed (Score:5, Funny)
Whisky Tango Foxtrot? (Score:4, Insightful)
Instead I read about some geek with way too much time on his hands. Yawn.
Dear Apple (Score:3, Interesting)
Why no-stars? Because that way the majority of the collection is unrated. Starred songs really stand out in a playlist. 1 and 2 star songs play less often than no-stars, while 3, 4, and 5 play more often. But I want my favorites to play much more often than your arbitrary algorithm.
Crunching the math on Slashdot (Score:5, Interesting)
But all this talk of 0, 1, 2, 3, 4, 5 has me thinking of another rating system. Would anybody care to do an analysis of the ratings in Slashdot comments? What are the relative populations (I expect a ton of 2's but how about the rest)? Do comments made in the first hour after a story is posted stand a better chance of reaching +5 than comments made later in the day?
One of my gripes about the Slashdot comment system is that it discourages contemplation and discussion. Comments made more than 24 hours after a story is posted are rarely read and almost never moderated. This is in contrast with comment systems like Usenet or other bulletin boards, where threads can remain lively for weeks.
AlpineR
Car mp3 CD player (Score:3, Funny)
Re:Car has a "random" bug (Score:4, Funny)
Product: Audi S4
Component: CD Player
Status: ASSIGNED
Severity: Normal
Hardware: All
OS: All
Resolution: Not a bug
Summary: Car has a "random" bug
Description:
I have a certain CD that causes my Audi S4 (when set to random mode) to play the same track over and over and over. Guess somebody didn't prove their recurrence actually worked.
Solution:
CD contains only one track. Random mode functioning properly.