Hipsters, Fixie Bikes, and Elevation: Hipsters even more ridiculous than previously thought?

This post was recently brought to my attention via a Facebook post’s link to this nerd-famous blog. Reproduced here (stolen? though attributed) is the ranking of cities by popularity of the fixie bike, which is intended to be interpreted as a proxy for the relative hipsterness of each of the cities.

As much as I like making fun of hipsters and their love of all that sucks and is impractical/absurd, I figured even hipsters only like something that sucks if it doesn’t suck that much. Of course, what sucks is relative. A spray of cold water to the face doesn’t suck that much on the 4th of July in Houston, but it really sucks if you’re a cat, even on the 4th of July in Houston. Thus, I hypothesized that the inability of the fixie bike to shift gears sucks a lot more in places where there are hills.

Unable to find a “hilliness index” (by city or otherwise), I simplified the problem and just took the difference between max and min elevation. What I expected was a negative relationship between the amount of climb one would maximally have to do in a city and this mysterious Fixie index. I was wrong.

 Could it be? Do hipsters love fixie bikes even more in places where it most sucks to have one? Are hipsters just drawn to cities with large elevation changes? Maybe hipsters just like mountains.

Clearly what is driving this relationship are those points out to the right: Los Angeles, San Jose, etc. Also, this index of hilliness is probably not that great, as it doesn’t differentiate between tons of hills and just one hill. What is needed is a better index of hilliness, which is now in the works.  So, stay tuned– I’ve got more to say on this once I find a way to quantify “hilliness”.
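To make the “one hill versus tons of hills” problem concrete, here is a minimal Python sketch. Both elevation profiles are made up, and cumulative climb is only one hypothetical alternative, not necessarily the index that will come out of this.

```python
def elevation_range(profile):
    """Max minus min elevation: the crude index used above."""
    return max(profile) - min(profile)


def cumulative_climb(profile):
    """Sum of all uphill steps along the profile (a hypothetical alternative)."""
    return sum(max(b - a, 0) for a, b in zip(profile, profile[1:]))


# Made-up elevation samples (meters) along a route through two imaginary cities.
one_big_hill = [0, 25, 50, 75, 100, 75, 50, 25, 0]
many_hills = [0, 100, 0, 100, 0, 100, 0, 100, 0]

for name, profile in [("one big hill", one_big_hill), ("many hills", many_hills)]:
    print(name, elevation_range(profile), cumulative_climb(profile))
```

Both profiles have the same max-minus-min, but the second one would be far more miserable on a fixie, which is exactly what the crude index misses.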

 

What in the holy hot hell hypnotizingly happened in my attractively adorable absence?

It comes to my attention that during my brief absence from reading celebrity gossip, things have gone terribly downhill. Courtney Stodden has appeared on the scene, and I would hope this would trigger any and all models of the time until the apocalypse to move their prediction up to about… tomorrow. I mean, seriously. I leave for a year and this is what happens?? We are now paying attention to insanely creepy 17 year old child brides, whose only talents are excessive makeup application and the ability to alliterate and use a thesaurus (only to look up synonyms for words like “lubricated” and “seductive”).

So, against my better judgment since I am not yet officially gainfully employed, I will share with you my latest time-killing project**. Stoddenify your speech!! Follow the link, type in a normal sentence, and get back a Courtney Stoddenified sentence… complete with occasional grammatical incorrectness and nonsensicality!!

** I used this guy’s code to determine the part of speech, plus a few modifications of my own.

I’m back!

You probably didn’t realize I was gone. That’s ok. Just pretend like you missed me.

Anyway, I’m fresh off of a quarter-life crisis induced year of international wanderings. In case you are wondering (again, just pretend), after my postdoc in Brazil, I stayed for a while and did whatever you do on the beach (absolutely nothing) for a few months. Then off to Germany for some climbing, followed by Colombia and Ecuador, and then up to California. Back down to Peru to do the Inca Trail with some buddies, and then a roadtrip around Europe in a rented car. (Sentence fragments aren’t bad if you’re blogging. Promise.)

So, in the interest of undoing some of the brain atrophy I’ve experienced over the last year, expect to see a new post every once in a while.

Brasil ranks 31st out of 44 in English proficiency

A few months ago, I did a post about my guess that someone whose first language is widely spoken would be less likely to speak English than someone whose first language is relatively obscure. It looks like I’ve been outdone.

English First has done a study that assesses the English proficiency of adults in various countries. From this, they have put together an English proficiency index and made some pretty nifty maps and plots.

The English First folks also investigated the same phenomenon that I did in my post. Clearly they have a much bigger budget (greater than $0) for doing these sorts of things, and they didn’t just cull their data from Wikipedia, so I tend to go with what they say. Good thing their results support my own– again, that people whose first language is shared by many are less likely to speak English. However, the relationship they found was “weak.” See below.

EF EPI

If you’re upset by the fact that the relationship here appears to be in the opposite direction of that which I found earlier, don’t be. I was looking at the negative log of the number of native speakers. Why I transformed the data like that, I don’t actually remember, but rest assured that this is showing roughly the same thing. Of course, this isn’t exactly the same thing, the most obvious reason being that they are looking at “English proficiency”, whereas I was looking at the “percent of English speakers.”

They also compare English proficiency to various other variables they believe should be related, such as  the value of exports per capita, the average number of years of schooling, and gross national income per capita. All of these had a stronger relationship to the English proficiency than the native speakers variable.

One last mildly interesting nugget of information, which was mentioned in the Brazilian article that pointed me to the English First study and website, is that all of the BRIC countries fall right in line. China, India, Brazil, and Russia took the 29th, 30th, 31st, and 32nd spots respectively. The article also pointed out that, although Brazil did not do so well worldwide in this ranking, at least it beat Venezuela and Chile!

The Anne Hathaway Effect

I recently stumbled upon this article in the Huffington Post which claims that every time Anne Hathaway gets a lot of Internet attention (for releasing a movie, hosting the Oscars, or what have you), the stock price for Berkshire Hathaway shoots up. The author, Dan Mirvish, justifies the plausibility of this by saying that “My guess is that all those automated, robotic trading programming are picking up the same chatter on the internet about “Hathaway” as the IMDb’s StarMeter, and they’re applying it to the stock market.” 


The data they use to support the claim is that 

Oct. 3, 2008 - Rachel Getting Married opens: BRK.A up .44%
Jan. 5, 2009 - Bride Wars opens: BRK.A up 2.61%
Feb. 8, 2010 - Valentine’s Day opens: BRK.A up 1.01%
March 5, 2010 - Alice in Wonderland opens: BRK.A up .74%
Nov. 24, 2010 - Love and Other Drugs opens: BRK.A up 1.62%
Nov. 29, 2010 – Anne announced as co-host of the Oscars: BRK.A up .25%



I think the first commenter put it well when s/he said 

“First!”

Nah, just kidding. Here’s what they really said:

This is junk statistics if I’ve ever seen it. There may be something to the automated trading idea, but these data are proof of nothing. How about the hundreds of other times Ms. Hathaway was in the news and the stock didn’t rise so dramatically? How volatile is this stock normally? Are these percentage increases anything out of the ordinary?
Exasperated, I decided to do a quick test. I downloaded the BRK.A data from Jan. 1, 2008 to Mar. 18, 2011 from YAHOO Finance and did a trivial analysis of it in Matlab. Just looking at the difference between open and close prices, the stock was up 0.25% or more 308 times over this period. The stock was up 2.61% or more 47 times over this period. Those two percentages are the lowest and highest in Mr. Mirvish’s “data.”
As a scientist and math lover I’ve disappointed to see this story making the rounds with so little skepticism. It’s a statement for the level of understanding of statistics and probability by the general public.

Looks like I’m not the only mathbuster out there. 


My first complaint about this (and backing up commenter number 1) is that, as someone who does not follow stocks at all, I have no idea if a .74% increase in BRK.A is anything notable.  Having downloaded the stock prices since 2008 from Google Finance, I can tell you that it isn’t.  When Rachel Getting Married opened, the .44% increase was in the 68th percentile of changes in price… including negative changes. It was only in the 32nd percentile of positive changes. Even the biggest change of 2.61% is only in the 92nd percentile overall. Certainly not a tail event.  Getting to the point, it’s not like every time Anne Hathaway gets naked with Jake Gyllenhaal, the stock holders all go out and buy themselves a brand new G6. It’s a pretty normal fluctuation.
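For anyone who wants to repeat the percentile check, a minimal Python sketch follows. The list of daily changes is a made-up stand-in for the downloaded series, and `percentile_rank` is just a helper name I picked, so only the method carries over, not the numbers.

```python
def percentile_rank(values, x):
    """Percent of values that are less than or equal to x."""
    return 100.0 * sum(v <= x for v in values) / len(values)


# Hypothetical daily percent changes; the real series would come from the price download.
daily_changes = [-2.1, -0.9, -0.3, -0.1, 0.0, 0.2, 0.44, 0.8, 1.5, 2.61]

# Rank of a .44% day among all changes, then among positive changes only.
overall = percentile_rank(daily_changes, 0.44)
positive_only = percentile_rank([c for c in daily_changes if c > 0], 0.44)
print(overall, positive_only)
```

The point of computing both ranks is the same as in the text: a change can look respectable among all days and still be unremarkable among the up days.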

Over the period from 2008 to yesterday, the stock increased about 47% of the time. Since we are apparently completely disregarding the magnitude of the change, the probability of getting all positive changes when randomly selecting 6 dates out of the 828 trading days is quite small. But what would be the chances of looking at, say, 10 different dates and finding that 6 or more of them are positive?? If we ignore the issue of replacement (which shouldn’t be horribly important since the sample size is 828 and we are only sampling 10), the probability of getting exactly 6 is about 18%, and the probability of getting 6 or more is about 31%. 
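Those percentages are plain binomial arithmetic, and a short Python sketch reproduces them. The 47% and the 10 dates come from the paragraph above; treating the dates as independent draws is the same simplification already made there.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


p_up = 0.47   # share of trading days on which the stock rose
n_dates = 10  # number of dates we imagine looking at

all_six_up = p_up**6  # six dates picked, all six positive
exactly_six = binom_pmf(6, n_dates, p_up)
six_or_more = sum(binom_pmf(k, n_dates, p_up) for k in range(6, n_dates + 1))

print(round(all_six_up, 3), round(exactly_six, 2), round(six_or_more, 2))
# prints: 0.011 0.18 0.31
```

So picking exactly six dates and having all of them go up is about a 1% event, but finding six up days among ten candidates happens almost a third of the time.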


Given that the hypothesis is that the stock price is getting this little upward nudge because of Internet chatter, I checked out Google Trends to find other likely dates that the stock should increase under this hypothesis. Luckily, Google even shows you what the major news stories are on some of the major peaks, so it is easy to figure out the date.

Google Trends for Anne Hathaway
The top line is search volume and the bottom is news volume. They pick out many of the same spikes.
Two big peaks we see on here that haven’t already been accounted for in the original post are B, Anne Hathaway Proclaims Love For ‘Family Guy,’ ‘Aqua Teen,’ Fulfills Nerd Vision Of Idealized Woman, on February 23, 2009 and C, Anne Hathaway spends spare time studying physics, on February 2, 2010. On these two dates, BRK.A saw a 1.82% and .11% decrease respectively.  Further, when on June 20, 2008 the Los Angeles Times posted a story called Anne Hathaway versus Jessica Alba  resulting in the very visible spike in 2008 (I guess everyone likes a good ladyfight), BRK.A experienced a -.79% change. On the opening day of Get Smart, June 20, 2008, BRK.A fell .79%, and if we go back just a little bit further to December 9, 2005, the day that Brokeback Mountain had its major opening in the US, BRK.A dropped .07%. In fact, the sample correlation between Anne Hathaway’s Internet search traffic and the price of BRK.A for 2008 to yesterday was just .01– basically uncorrelated.**
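The sample correlation here is the ordinary Pearson correlation between two aligned series. A minimal Python sketch, with made-up numbers standing in for the search volume and the price:

```python
from math import sqrt


def pearson(x, y):
    """Sample Pearson correlation between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical aligned series: search volume and closing price on the same days.
search_volume = [1, 2, 3, 4, 5]
closing_price = [2, 1, 3, 1, 2]

print(pearson(search_volume, closing_price))
```

In this toy example the ups and downs cancel out and the correlation comes out at essentially zero, which is roughly what .01 looks like in practice.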


Given all of this, I’m really hoping that Dan Mirvish didn’t run out and buy up a bunch of BRK.A hoping that his post would force the price up a bit. :)


**This, of course, does not rule out the case that the fancy trading algorithms only act based on spikes in search volume, not normal activity, but just sayin’… 

Text me where the buildings are, and I’ll tell you where the building damage is.

Back in October 2010, Patrick Meier posted an article called How Crowdsourced Data Can Predict Crisis Impact: Findings from Empirical Study on Haiti on his blog, iRevolution. It might be worth your time to go skim that really quickly if you want to get the biggest bang for your buck as you continue reading this… go ahead, I’ll wait.

If you did your homework, you already know that in his blog post, he recaps some pretty interesting results from a  team at the European Commission’s Joint Research Center (JRC). The researchers who did this study were very awesome and sent me the original paper along with some hints as to how they did their analysis. If you want the paper, which appears in Conference Proceedings from the 2nd International Workshop on Validation of Geo-Information Products for Crisis Management, you’ll have to track down the proceedings. Alternatively, you can watch the presentation video.

Meier wrote that the JRC team used the SMS reports mapped on the Ushahidi-Haiti platform “to show that this crowdsourced data can help predict the spatial distribution of structural damage in Port-au-Prince”.  The SMS messages they use were collected starting just four days after the disaster and were sent by Haitians with their “location and urgent needs.” Through the magic of spatial statistics, these researchers show that they are able to predict the locations of building damage using the SMS data. They point out that in the event of an emergency such as the Port-au-Prince earthquake, this sort of prediction would be very useful because it is cheap and real-time. You don’t need a small army of “some 600 experts from 23 different countries” and the World Bank to assess detailed satellite imagery to pinpoint the damaged buildings. All you’d really need is a much smaller sample of damaged buildings with which to correlate the SMS data, and voila! As you get more SMS data, you would be able to predict where more building damage is (read: people needing help are).

Let’s start by taking a look at some of the figures from the paper that support this claim.  Figure 1 (in this blog, Figures 4 and 5 in the paper) shows a derivative of Ripley’s K-function, which essentially determines whether same-type events (top row) or different-type events (bottom row) can be said to cluster together at various distances. Remember that this paper’s main idea is to show that  building damage is clustered near SMS messages. One type of event is an SMS message, and the other type is a highly damaged building, as judged by the previously mentioned “experts”. The data are the locations of each of these types of events across a 9km x 9km square that comprises the city of Port-au-Prince. The horizontal axis, across which this L function is calculated, represents the distance between the location of events. The green lines are 80% confidence intervals. In a nutshell, if the black line (the calculated L statistic) falls above the green line at any point, then we are to think that within this radius around any given event, events of the same type (top row) or different type (bottom row) are more likely to occur. So, for example, if we look in the bottom right plot of Figure 1, we find that for radii between about 1000m and 3000m from any SMS message, we are likely to find a higher-than-average number of damaged buildings. Hence the usefulness of the SMS messages in this situation.
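If the L statistic is new to you, here is a bare-bones Python sketch of the cross-type version. It is a naive estimator with no edge correction, the coordinates are made up, and `cross_L` is my own helper name; the paper’s exact estimator, edge correction, and confidence bands may well differ.

```python
from math import hypot, pi, sqrt


def cross_L(points_a, points_b, r, area):
    """Naive cross-type L statistic at distance r (no edge correction)."""
    pairs_within_r = sum(
        1
        for (xa, ya) in points_a
        for (xb, yb) in points_b
        if hypot(xa - xb, ya - yb) <= r
    )
    # Cross-type K: scaled count of (a, b) pairs no more than r apart.
    k_hat = area * pairs_within_r / (len(points_a) * len(points_b))
    # L rescales K so that complete spatial randomness gives L(r) close to r.
    return sqrt(k_hat / pi)


# Hypothetical coordinates in meters inside a 9 km x 9 km window.
sms = [(1000, 1000), (5000, 5000)]
damaged = [(1100, 1000), (5000, 5200), (8000, 8000)]
area = 9000 * 9000

for r in (150, 500):
    print(r, round(cross_L(sms, damaged, r, area)))
```

Values of L(r) well above r mean there are more SMS and damaged-building pairs within distance r than you would expect if the two types of points were scattered independently.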

Figure 1: L statistic from original paper

But, let’s think about this for a second. Does it really make sense that this would be the case for a radius of 2km but not 500m? That is, would it really make sense to believe that people are texting for help 2km away from major building damage but not right near the site? Sure, I guess I could buy that. I suppose it could be the case that people very close to the damaged buildings are either dead or incapacitated and thus unable to send SMS messages. I wouldn’t expect this to be the case up to a kilometer away from the most damaged buildings, but I’ll go with it for now. Secondly, how useful is it to know that there are likely to be damaged buildings within a 2km radius of any text? If we assume that we don’t already have a good idea of where buildings are without the text messages, my high school geometry tells me that this 2km radius implies an area of about 12 and a half square kilometers in which we blindly search to find the expected extra building damage. Even subtracting off that inner radius, where there is not likely to be extra damage, we’re still left with almost 10 square kilometers. Again, I’ll go with it. Maybe the information from all of the text messages combined gives more practically useful information.
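The high school geometry, spelled out in a couple of lines of Python. The 2 km outer radius is from the paragraph above; the 1 km inner radius is my reading of where the effect starts on the plot.

```python
from math import pi

outer_km = 2.0  # search radius around a text message
inner_km = 1.0  # assumed inner radius with no extra damage (read off the plot)

disc_area = pi * outer_km**2
ring_area = disc_area - pi * inner_km**2

print(round(disc_area, 2), round(ring_area, 2))
# prints: 12.57 9.42
```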

The most convincing graphic from this paper (labeled as Figure 7 from their paper, and Figure 2 in my blog) is that which shows the observed density of building damage next to the predicted building damage density given SMS messages.  Yep, I agree that this passes the eyeball test. It does look like SMS messages are doing a pretty good job of sniffing out building damage.

Figure 2: Predicted and observed building damage density from original paper.
Alright, now let’s take a closer look. I also got a hold of the larger data sources used in this analysis. Because the paper does not list the exact boundaries they used to define Port-au-Prince in their data set, I tried to recreate their data set based on the number of events they reported to have included in the analysis and guessing what the boundaries of their plots were by finding landmarks on a map. After many hours of trying to find a subset of these larger datasets to match SMS and building damage data sets used in the above analysis perfectly, I emerged with something that is hopefully sufficiently similar.  First, because I will be doing some statistics and thus no one will trust me (thanks a lot, Mark Twain), I reproduce the above plots using my datasets. Although it looks like I cut off a little bit of space over on the right when trying to match their dataset, for all intents and purposes, I think I’ve got the same thing. They’ve got 1645 SMS messages, and I’ve got 1651. They use 33,800 damaged building locations, while I use 33,153. Although the plots that I have reproduced (Figures 3 and 4) are not *exactly* the same as those presented in the paper (above), I think they are similar enough to conclude I am doing the same thing they are given that the datasets are slightly different and some of these plots require some tuning parameters. I’m satisfied.
Figure 3: My reproduction of the L statistic plots that appear in the original paper using my dataset.
Figure 4: (left) Fitted conditional density of building damage given SMS messages. (right) Observed density of building damage. Both of these plots were produced from my datasets and are intended as reproductions of the plots in the original paper.

My first main question upon reading this paper was whether these text messages were specifically picking out damaged buildings or whether they were simply finding areas of high building density. After all, people send the text messages and people do tend to be in areas with lots of buildings. I re-ran the same analysis with a random sample of 1000 buildings. This is as opposed to the previous plots which were run with a random sample of 1000 damaged buildings. Proceeding with their 80% confidence interval convention,  I find very similar results. For radii of about 1.5-3km, SMS message locations correlate with building locations, not just damaged building locations. Further, according to the infallible eyeball test, it seems that the SMS data is doing a good job of finding all of these buildings. (Figures 5 and 6)

Figure 5: L statistics for SMS messages and a random sample of all buildings.
Figure 6: (left) Fitted conditional density of buildings given SMS messages. (right) Observed density of all buildings. 

So, what’s going on here? My initial reaction was “Blimey! These text messages are just picking out buildings, not damaged buildings!  Damaged buildings can only occur where there is a building, and because text messages correlate with buildings themselves, the correlation between text messages and damaged buildings is merely an artifact!”  After some quiet introspection,  I realized that I may have jumped the gun.  Because we only used the trusty eyeball test, we haven’t looked at whether text messages do a better job of picking out the specifically damaged buildings than they do any building at all.

For my next trick, I run a Poisson regression. Following the original paper, I bin the data into a 30 by 30 grid, counting up the number of total buildings, damaged buildings, and SMS messages sent in each grid square. A quick diagnostic plot of the total counts versus damaged counts indicates that there is a pretty good linear relationship between the two–  the number of damaged buildings in any square is approximately a constant times the total number of buildings in that square. Although I am hoping with all of my might that my PhD advisor does not read this and find out that I did not use a formal (Bayesian!) spatial model to handle this clearly spatial data, I simply ran a few Poisson regressions to see if the SMS data really is adding anything beyond what we already know from the building counts. In my experience, incorporating a spatial model in the regression would only serve to reduce the significance of the covariates anyway.  I fit the linear model

Damaged Buildings ~ Poisson( exp{ b0 + b1 * SMS + log(Total Buildings + 1) } ). (Model 1)

This model includes one plus the total number of buildings as an offset. Adding one simply serves to eliminate the computational problem of taking the log of zero.  As discussed in the Wikipedia article linked to offset, this is often used to control for a baseline, in this case the total number of buildings in a square. The results of this regression are

Call:
glm(formula = damcounts ~ offset(log(allcounts + 1)) + textcounts,
    family = poisson(link = "log"))
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-16.002   -3.074   -0.646    1.324   21.507
Coefficients:
              Estimate Std. Error  z value Pr(>|z|)  
(Intercept) -1.5669207  0.0061123 -256.353   <2e-16 ***
textcounts  -0.0024470  0.0009817   -2.493   0.0127 *

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for poisson family taken to be 1)
    Null deviance: 23336  on 899  degrees of freedom
Residual deviance: 23329  on 898  degrees of freedom
AIC: 26256

For those of us not used to reading R output, look at the number to the far right of “textcounts” (the p-value). While the coefficient on the number of text messages is significant, the sign is in the opposite direction from what I expected! Having more text messages in a grid square results in a prediction of fewer damaged buildings! Could this be that before sending text messages, the people sending them moved away from the damaged buildings for safety reasons?
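If you want to check the textcounts row by hand, the z value and p-value follow directly from the estimate and its standard error. A minimal Python sketch using the two numbers printed in the Model 1 output:

```python
from math import erfc, exp, sqrt

# Estimate and standard error from the textcounts row of the Model 1 output.
estimate, std_error = -0.0024470, 0.0009817

z = estimate / std_error
p_value = erfc(abs(z) / sqrt(2))  # two-sided p-value under a standard normal
rate_ratio = exp(estimate)        # multiplicative change in expected damage per extra text

print(round(z, 3), round(p_value, 4), round(rate_ratio, 4))
```

The rate ratio comes out just under 1, so each additional text message in a square nudges the predicted damage rate down by a fraction of a percent: statistically detectable, but tiny.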
Next, I suspect that areas of high building density have a higher percentage of damaged buildings than areas of low building density. Imagine that in a dense area, one building falling could cause damage in others, whereas in a less dense area, this would be less likely to happen. To attempt to control for this, I ran another regression in which I include an additional covariate that is just the total number of buildings in the square. That is,
Damaged Buildings ~ Poisson( exp{ b0 + b1 * SMS + b2 * Total Buildings + log(Total Buildings + 1) } ). (Model 2)
The results from Model 2 show that the number of text messages is not significant at the magical 95% confidence level.

Call:
glm(formula = damcounts ~ allcounts + offset(log(allcounts +
    1)) + textcounts, family = poisson(link = "log"))
Deviance Residuals:
     Min        1Q    Median        3Q       Max
-16.2803   -2.6896   -0.5842    1.3627   19.3989
Coefficients:
              Estimate Std. Error z value Pr(>|z|)  
(Intercept) -1.768e+00  1.159e-02 -152.58   <2e-16 ***
allcounts    3.794e-04  1.792e-05   21.18   <2e-16 ***
textcounts  -1.851e-03  1.006e-03   -1.84   0.0657 .

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 


(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 23336  on 899  degrees of freedom
Residual deviance: 22889  on 897  degrees of freedom
AIC: 25817
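For readers without R, a Model 1 style fit (one covariate plus an offset) can be sketched in plain Python. This is a hand-rolled Newton-Raphson, not what glm does internally line for line, and the counts below are made up; `fit_poisson_offset` is a hypothetical helper name.

```python
from math import exp, log


def fit_poisson_offset(y, x, exposure, iterations=50):
    """Newton-Raphson fit of y ~ Poisson(exposure * exp(b0 + b1 * x))."""
    offset = [log(e) for e in exposure]
    # Start from the intercept-only solution with b1 = 0.
    b0, b1 = log(sum(y) / sum(exposure)), 0.0
    for _ in range(iterations):
        mu = [exp(o + b0 + b1 * xi) for o, xi in zip(offset, x)]
        # Score vector (gradient of the log-likelihood).
        u0 = sum(yi - m for yi, m in zip(y, mu))
        u1 = sum(xi * (yi - m) for xi, yi, m in zip(x, y, mu))
        # Fisher information matrix entries.
        i00 = sum(mu)
        i01 = sum(xi * m for xi, m in zip(x, mu))
        i11 = sum(xi * xi * m for xi, m in zip(x, mu))
        det = i00 * i11 - i01 * i01
        step0 = (i11 * u0 - i01 * u1) / det
        step1 = (i00 * u1 - i01 * u0) / det
        b0, b1 = b0 + step0, b1 + step1
        if abs(step0) + abs(step1) < 1e-12:
            break
    return b0, b1


# Hypothetical grid squares: total buildings, text messages, damaged buildings.
all_counts = [0, 10, 50, 120, 200, 80]
text_counts = [0, 1, 3, 0, 5, 2]
dam_counts = [0, 2, 11, 27, 43, 17]

# The +1 mirrors the offset log(Total Buildings + 1) used in the models above.
exposure = [n + 1 for n in all_counts]
print(fit_poisson_offset(dam_counts, text_counts, exposure))
```

The offset is what turns this into a model of the damage rate per building rather than the raw damage count, which is the whole point of controlling for the baseline.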
 Lastly, and I won’t show the output this time, if we ignore the offset completely and regress the number of damaged buildings on the total number of buildings, the square root of the total number of buildings, and the number of text messages, we find that the coefficient on the number of text messages has a p-value of .41– far from significant… even at the 80% level. The rationale for this was simply that some exploratory data analysis suggested that the square root of the total number of buildings might be a good predictor of the number of damaged buildings. From a geometric point of view, if the streets within a square are themselves arranged in a grid, this would be approximately the average number of buildings per street in that square and could maybe proxy for density. 
  
For the non-statisticians in the crowd, what this means is that given just the number of  buildings in a square, the number of text messages sent from within that square  is not an important factor in determining the number of damaged buildings! So, although text messages may be useful in identifying locations with buildings, if you already know where the buildings are, the text messages are not particularly useful (in this particular case) for figuring out how many of those buildings are damaged. Assuming that a crisis response team could more quickly access maps of building density than even the SMS data, ignoring the SMS data could lead to an even faster and cheaper response in this case.
At this point, if you are paying careful attention, you may think that I’ve missed the point. We did already show that for small radii, text messages are not correlated with building damage. The approximate 0.15km radii within each box are certainly under the threshold for which we wouldn’t expect to see any relationship between text messages and building damage under the original analysis. We already knew that, but I think this is a more formal way of making the point that building locations may be enough to find damaged buildings.  
To conclude, one of the main advantages presented in the blog post was how much time and money using SMS messages to find damaged buildings could save. Crowdsourced data may have its uses, but for finding damaged buildings for the case in Haiti, I’d like to propose an even cheaper alternative: a few statisticians, a map, and some coffee.


**** Data obtained from UNITAR/UNOSAT. 

Stop the presses! Psychic Phenomena are Real!!!!

Now, this might be the coolest thing ever! Some researchers claim that they have conducted experiments that show that psychic phenomena (pre-cognition, i.e. telling the future!!!) exist. Here’s the article that alerted me to this (which was sent to me by one extra-special Craigory Craig, who I won’t link to because he’s a professional now or something), and here’s a pre-print of the paper.

To begin, this is by far my favorite sentence from the paper:

After responding to two individual-difference items (discussed below), the participant had a 3-min relaxation period during which the screen displayed a slowly moving Hubble photograph of the starry sky while peaceful new-age music played through stereo speakers.

Why am I not surprised that this was the set-up researchers in this field would choose? I must be psychic.

In the above patchouli-scented experiment, they present the participants with two doors to choose between, one of which had a picture behind it and the other had nothing– sort of like Let’s Make a Deal / Monty Hall game except instead of a car, you are rewarded with a picture of people doing it, and instead of a goat, you just get a blank screen. No, seriously, some of the pictures that were behind the curtain were “erotic pictures” (i.e. people doing it). The awesome thing here (if you have the sense of humor of a 13 year old boy, much like I do) is that people were able to guess with statistically better than 50% accuracy which curtain the picture was behind… as long as it was an erotic picture. My first thought is that this sort of psychic power explains why I miraculously turned up at my dorm room pretty much every time my freshman year roommate wanted it to herself. The force is strong with this one.

In another section of the paper, they talk about retroactive priming.  Each person was asked to indicate whether a picture was pleasant or unpleasant. In the retroactive experiment, a word was then flashed on the screen that was either congruous or incongruous with “pleasant” or “unpleasant”. In the plain vanilla version, the  priming word was flashed first. In these experiments, we’d apparently expect to see that it takes a person longer to select “pleasant” or “unpleasant” if the prime was incongruous with what they were trying to choose, and I guess this has been shown in forward priming experiments. Between pictures, a photograph from the Hubble telescope again made an appearance… because apparently photographs from the Hubble telescope are to psi-sense as sorbet is to tongues.

So, here’s what I’m thinking:

Why are people only able to have pre-cognitive powers related to erotic images? Is this what the researchers set out to prove in the first place? If not, it seems that one could partition the pictures into categories such that one of the categories proved statistically significant. I actually don’t think they were being dishonest in that way, though. Just sayin’.

Certainly there have been other priming experiments done in the past in which a series of primes and pictures were presented without the delicious raspberry Hubble telescope in between. If retroactive priming is real, could they not re-analyze those old studies to see if the retroactive priming effect was present when it was not the explicit purpose of the study? It would be awesome if it were, as evidence of this would have just been sitting around waiting to be discovered.

If it’s not, I am actually not so quick to take that as evidence that these sorts of psychic abilities aren’t real. Could that not be evidence that people have psychic abilities that lean in the direction of pleasing the experimenter by confirming the hypothesis of the study, even if the hypothesis was unknown to the participant? I mean, shit, if they were psychic enough to know what the word was before they saw it, they ought to be psychic enough to know what the experimenter was trying to get at. And, how crazy would that be??? That would certainly call into question all designed experiments in psychology, as effects could also then be attributed to the participants’ inclination to confirm the hypothesis, even if the hypothesis was not disclosed.

In any case, this is not a math-busters style post. I’ll leave the replication of this study to the ghost-busters / psychologists. Until then, I’ll be eagerly waiting to see if this ends up getting busted…

 So, what do you think? Do psychic phenomena exist? If you don’t believe this, how much evidence would you need to overcome your prior?

Daylight Savings Time!

The only way I can ever remember which direction Daylight Savings Time changes the time is with the saying “spring forward, fall back.” The fact that the direction of the changes is dictated by the season (i.e. how early the sun rises and sets) should have made it obvious what would happen with the time in the southern hemisphere relative to the northern hemisphere. In fact, I never stopped to think about this until… yesterday.

When I arrived in Brazil on October 6, I was one hour ahead of the US’s east coast. One day, I woke up,  my cell phone time had sprung forward, and I was magically two hours ahead of the east coast. On Sunday, the east coast fell back, and I am now three hours ahead.

This is not earth-shattering news. It’s just kind of weird. I’m guessing that this has never occurred to most people who have not switched hemispheres or do not work with people in the opposite hemisphere.

So, now you know.

Joint Probability of Being Mauled by a Bear and Struck by Lightning

This is an oldie but a goodie. A while ago, Ms. Sarah Bailey posted this article on my Facebook wall about a guy who got struck by lightning and mauled by a bear. They go on to say that the closest estimate of the probability of both of these things happening is zero. Agreed… for any random person.

Every person, of course, does not have the same probability of being hit by lightning and being mauled by a bear. Take Donald Trump, for example. While Zeus probably hates him for being the most pompous shit ever, thus making him about 1,000 times more likely to be hit by lightning than the normal person, I’d hazard a guess that he is rarely if ever within 100 miles of an un-caged bear.

On the other hand, look at Rick Oliver. According to the article, “he tends to piddle about his farm, checking on his chickens, working on his tractors and, as he was in the wee hours of June 3, fixing up his Chevy Malibu.” It was while piddling that, upon hearing a mysterious noise off in the distance, he went alone to investigate. I’d say that sort of behavior makes you pretty darn likely to be mauled by a bear. It might also make you pretty darn likely to get struck by lightning if that same tendency to investigate noises outside also applies to thunder. 

Two points here: (1) these events are not independent. They are probably conditionally independent given a number of factors, such as rural-dwelling, gender==male, a love of Kenny Chesney, etc.  (2) If you meet several of those conditions (i.e. if you’re the sort of person who goes looking for bears/lightning), as rare as occurrences of bear maulings and lightning strikes are in the overall population, I’d say you’re fairly likely to be attacked by both.
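A toy calculation in Python shows how much conditional independence can matter. Every number below is made up purely for illustration: two kinds of people, with bear and lightning risks that are independent within each kind.

```python
# Hypothetical population: shares and risks are invented for illustration only.
groups = {
    "goes looking for noises": {"share": 0.01, "p_bear": 0.01, "p_lightning": 0.01},
    "stays inside": {"share": 0.99, "p_bear": 0.000001, "p_lightning": 0.000001},
}

# Marginal probabilities across the whole population.
p_bear = sum(g["share"] * g["p_bear"] for g in groups.values())
p_lightning = sum(g["share"] * g["p_lightning"] for g in groups.values())

# Within each group the events are independent, so the joint is a mixture of products.
p_both = sum(g["share"] * g["p_bear"] * g["p_lightning"] for g in groups.values())

# How badly the naive "multiply the marginals" answer undershoots.
print(p_both, p_bear * p_lightning, p_both / (p_bear * p_lightning))
```

With these invented numbers, multiplying the two population-wide probabilities understates the chance of both happening by a factor of roughly a hundred, because almost all of the risk sits with the same small group of people.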

Harm Caused by Animals

Possibly due partially to my most recent post re: personal alcohol expenditures, several people have sent me this few-days-old link, Harm Caused by Drugs, from The Economist. They show a plot of the relative harm caused by various drugs, both to society and to the individual. Alcohol ranks first. I guess I’m effed.
While I guess it’s cool, what I keep pointing out is that as far as I can tell, what they are plotting is not data on { mortality / crime / loss of dignity / accidental pregnancy / increased probability of jumping naked on a trampoline } that can be attributed to use of the drug. They are plotting some “drug-harm” experts’ opinions on how much harm each drug causes. I’m certainly not saying that these people’s opinions aren’t valid, but how can the experts even assign a number to this? I actually looked at the summary of the study, and they are not giving rankings; they are coming up with these numbers based on weighting several different sub-categories of personal/societal harm. What is one unit of harm? How do you come up with the weights? Are harm to self and harm to society additive like this plot suggests? 
Also, the way this is phrased makes it seem as though this is a score of the intrinsic potential harm caused by the drug. I have a hard time believing that alcohol is fundamentally more harmful than, let’s say, crack cocaine. I think what got alcohol its primo number one ranking is the fact that it’s so common.
In this same spirit, I thought I would plot the harm caused by various animals according to an expert on the subject: Napoleon Dynamite. Each animal is ranked based on the harm it can cause to people due to natural fierceness and supernatural magic skills. Each of these is of course comprised of several subcategories, which were weighted according to their importance in determining overall  potential harm.   

On a related note, WTF, California??