August 11, 2015 by ihersey

Ian, Kona is wondering where you are…

So goes the title of the second such email I’ve received this year from United.

The thing is, I have a trip booked to Kona already…on United even.

Last time I got an email with this subject line, I had two trips booked to Kona on United (one for the Hawaii 70.3 race back in May, and one for Kona in October), and they were in fact the only upcoming trips I had at the time.

Somewhere at United, the left hand doesn’t know what the right hand is doing.

As ridiculous as this scenario seems, it happens all the time, and the problem isn’t unique to United Airlines. Marketing automation systems attempt to do personalized campaigns in order to get a better response rate. But they lack two central ingredients:

1. Current situational awareness from other transactional systems. United’s reservation system and all ancillary systems, such as checkin systems, frequent flier account, etc., aren’t one monolithic database, but rather a series of systems that need to update one another. This doesn’t happen instantaneously (this is why your frequent flier account doesn’t have your most recent flight on it as soon as you land); if every system updated every other system instantly, everything would bog down and grind to a halt – at least in the IT infrastructure on which a lot of these systems are currently built. Everyone wants “real time,” but most often the cross-system updates are still scheduled in large offline batch operations. (A good lesson on the IT complexity of airlines is talked about here and other places with respect to the merger of United and Continental.)
2. Intelligent, adaptive systems that take as much current customer context into account as possible. This is the holy grail of personalized marketing, but more often than not, campaign management systems are rule or model driven, and they only look at a few factors, such as when I made trips in the past and to where. The timing of my Kona emails is no surprise when you think about my past history: I’ve done the Hawaii 70.3 race in May a grand total of seven times since 2006, and this is my fourth Hawaii Ironman in October. The fact that I’ve already got reservations is a comical illustration of point 1 above, but anyone who knows me, either personally or through my various social networks, would know which races involving travel I’ve already signed up for, and combining that intelligence with my current transactional status would prevent emails like this that make it seem like the airline doesn’t know me, when in fact I’ve flown well over 2 million miles on United. They should know me very well by now!
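To make the first ingredient concrete, here is a minimal sketch of the kind of check a campaign system could run before sending a "wondering where you are" email, assuming it had a reasonably fresh view of reservations. Everything here is hypothetical: the `Reservation` type, the `destinations_to_promote` function, the customer ID and the departure date are made up for illustration and say nothing about how United's systems actually work.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Reservation:
    """Hypothetical record synced from a reservation system."""
    customer_id: str
    destination: str  # airport code
    depart: date


def destinations_to_promote(customer_id, candidates, reservations, today):
    """Drop candidate destinations the customer already has an upcoming trip to."""
    booked = {
        r.destination
        for r in reservations
        if r.customer_id == customer_id and r.depart >= today
    }
    return [d for d in candidates if d not in booked]


# Illustrative data only: one upcoming Kona (KOA) booking, date invented.
reservations = [Reservation("ian", "KOA", date(2015, 10, 5))]

# Past travel suggests Kona; an upcoming race suggests Klagenfurt (KLU).
print(destinations_to_promote("ian", ["KOA", "KLU"], reservations, date(2015, 8, 11)))
# prints ['KLU']
```

The check itself is trivial; the hard part, as described above, is getting the reservation data in front of the campaign system soon enough for it to matter.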

In fact, if they were smart, they’d be making me offers for business class fare to Klagenfurt, Austria next summer, seeing as (a) I’ve signed up for Ironman Austria, and (b) I likey my business class as long as you give me a reasonable deal. True personalization like this is something we’ve worked very hard on at Saffron with our autonomous learning platform, and we have some very good proof points in accuracy of individualized product prediction. We co-presented on this with our customer USAA at Wharton back in April. “Cluster of one” personalization is just in its infancy, but with the IT industry on its way to adopting technologies that solve the two challenges above – low-latency updates from transactional systems and intelligent systems that can treat all of us like the individuals we are – we shouldn’t be surprised to see the marketing of the future become “scary good.”

Meanwhile, back in the present, two days after I booked my trip to Phoenix for this November’s Ironman Arizona race, I got this:

<sigh> Really, United?

February 16, 2014 by ihersey

Aggressive Drivers: Time for a “Page of Shame”?

The San Francisco Peninsula is a great place to live and train; the cycling in the Santa Cruz Mountains and along the coast is hard to beat (especially if you like climbing). But the one fly in the ointment in training paradise has been the few drivers and motorcyclists who make cycling not just unpleasant, but downright dangerous — and sometimes fatal.

Today was a running day, and as I was heading home from Woodside Elementary School (where we park to go running in Huddart Park), a Mini Cooper turned onto Woodside Rd. from Mountain Home Rd. without stopping and proceeded to tailgate the SUV directly in front of him. The SUV had another car in front of him and was going the speed limit. Nothing really out of the ordinary, though, until the Mini driver started “slaloming” in the line, alternately crossing the centerline and going into the bike lane. He almost took out a cyclist.

Then he continued weaving, even trying to pass the two cars ahead by going right into the bike lane. Luckily there were no cyclists there at that moment, but he got stopped at the light at I-280, where Woodside Rd. splits from one lane in each direction to two. I ended up right behind him and snapped his photo:

He wasn’t pleased — he got out of his car (as did his passenger) and started threatening me. Luckily, I had witnesses all around, so he thought better of escalating things any further. I would have been happy to discuss things with a county cop present.

The point of this is not to single out this one guy (though he’s definitely one of the most aggressive drivers I’ve ever encountered, especially in a town like Woodside, which is usually crawling with cops on weekends); the larger question is how to use the power of social media to rally cyclists around identifying these dangerous drivers and doing something about them. They are literally life threatening.

I’m reminded of a great early website from the mid 90s: The Highway 17 Page of Shame. It was entertaining reading back then, and all of us have experiences like this at one point or another. But with all of the technology available today — camera phones and small video cameras with amazing resolution, plus the social networks to distribute the content to a wide audience — maybe it’s time to put that to use to ferret out not simply those drivers that annoy us but rather those who recklessly put our lives at risk.

June 18, 2013 by ihersey

Can social media predict tonight’s winner of “The Voice”?

Earlier on in the season, I’d have said an emphatic “no” based on the correlation (or lack thereof) between buzz on Twitter, positive buzz on Twitter, number of followers on Twitter, week-to-week growth in followers on Twitter, Klout score — and just about anything else I could measure — and the actual results. The past couple of weeks, however, have gotten slightly better results, and that could be for several reasons:

Better odds in general as the number of contestants declines.
More focused social media discussion of the remaining contestants — i.e., less “noise” — overlapping better with voting behavior.
More audience engagement as we hit the semi-finals and finals, similar to what we see with big-time sports (e.g., Super Bowl, March Madness, NBA finals, Stanley Cup raise the volume of discussion over what you see in the regular season).

Remember, too, that we are using Twitter buzz only as a proxy for voting behavior — tweets don’t actually count in the voting. What does count is telephone voting, SMS voting (but only if you’re on Sprint), Facebook voting and iTunes downloads, this last category having a disproportionately large effect with sufficient volume. We do not have access to any of the metrics for these sources, so the experiment has been to see how well buzz on Twitter correlates with the outcomes each week. So this is the final week of the experiment for this season.

So let’s look at overall Twitter buzz:

And now restrict the view to only “positive” buzz:

This does change the order between Michelle and Danielle if we look at the Top Entities view; the Swon Brothers are a distant third in both sets of metrics. We could posit that the Swons’ fan base doesn’t tweet, but that actually runs counter to generally observed engagement levels — the brothers are very active on Twitter compared with the other two contestants, not just with posting but also with replies and retweets. Perhaps two is better than one for keeping the chatter high.

Another interesting phenomenon became apparent in the buzz charts if we look at Top Hashtags: Michelle has a hashtag, #4eyesontheprize, that didn’t exist when I set the topic up, so this tag is not being counted in her totals under Top Entities. This means that her numbers are potentially higher than the Entity chart shows, though only for tweets that contain that hashtag and no other reference to her. So we can’t simply add the two numbers together. There’s also another telling phenomenon in that in overall buzz, #teamusher beats #teamblake, even though Team Blake has two of the three finalists. However, those numbers look vastly different if we just look at positive buzz. The thing about the sentiment coding here, though, is that it favors precision over recall, which is a somewhat technical way of saying it undercounts rather than overcounts. So when push comes to shove, I’m going with the old Oscar Wilde adage: “the only thing worse than being talked about is not being talked about.” — any buzz at all trumps positive buzz.
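The reason the two numbers can't simply be added is overlap: a tweet that uses the hashtag and also mentions the contestant by name or handle is already in the entity count. A small sketch with made-up tweet IDs (not real data from the dashboard) shows the difference between the naive sum and the distinct count:

```python
# Hypothetical tweet IDs, for illustration only.
entity_matches = {101, 102, 103, 104}  # tweets matching the contestant's entity definition
hashtag_matches = {103, 104, 105}      # tweets containing the new hashtag

naive_total = len(entity_matches) + len(hashtag_matches)   # double-counts 103 and 104
distinct_total = len(entity_matches | hashtag_matches)     # set union: each tweet once
hashtag_only = len(hashtag_matches - entity_matches)       # what the entity count missed

print(naive_total, distinct_total, hashtag_only)
# prints 7 5 1
```

Only the hashtag-only tweets are genuinely missing from the entity total, which is why the true number sits somewhere between the entity count and the naive sum.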

In that vein, however, we do have to remember that Danielle has consistently out-buzzed Michelle in many of the other weeks, and also that Danielle has more than 40% more followers than Michelle. On the other hand, follower counts have not correlated at all well with actual results, neither for “The Voice” nor for “American Idol”. Also, this week’s overall volume is significantly higher than in those previous weeks, so we’ll assume it follows the engagement trend of other finales, in which overall votes or other forms of engagement tend to correlate with viewing audience.

Anyway, it would appear the numbers are telling us that the Swon Brothers get third, and that there’s a somewhat conflicted picture of whether Danielle or Michelle wins, but I’m going with the overall buzz numbers and calling it for Team Usher’s Michelle Chamuel.

One last side note: it appears that “Voice” executives finally ponied up to Twitter, as all finalists’ Twitter accounts are now verified. In Twitter-speak, that means they now count as celebrities. 🙂

May 16, 2013 by ihersey

What Did We Learn from This Week’s “The Voice”?

Yesterday I looked at a number of different cuts of social media data and found a number of candidates for the Bottom 2 on “The Voice.” The interesting thing about the actual results was that none of the data I had access to correlated with the outcome, at least in a discernable pattern that you could build any sort of rule or model off of. I could try using a stats tool like R to find non-intuitive features that better predict an outcome, but that is probably like using a bulldozer to pound in a nail — there aren’t enough samples to make that approach meaningful. So I’m left with hypotheses:

Twitter isn’t a good proxy for voting behavior in the case of this show due to either a demographic skew or to the fact that the audience gravitates in general to other channels of interaction.
I haven’t found the right features yet. I used growth in Twitter followers as one potential proxy for “momentum,” and that correlated with one of the results — Vedo had by far the lowest percentage growth in followers week to week (he started of course with the highest number, so it gets harder to maintain the same growth rate), but in the case of the other ousted contestant, Garrett Gardner, we saw the highest percentage growth in followers, so that piece of data doesn’t correlate at all. Garrett, in fact, by any of the numbers I had access to or derived, should still be on the show. (Though I personally found his vocal performance on Monday weak, despite an interesting arrangement.)
The vote tallies among the bottom half of the contestants are actually fairly close (which the counts I had showed as well), so until we get to a point where we see either much greater volume or much greater differences between contestants, the best we’re going to do is random guesses over a larger pool of contestants with similar numbers. In other words, absolute rankings don’t work until you start to see larger gaps.

One thing is for sure — pure follower count is meaningless in this show (as it was in last week’s “American Idol”). A lesson for all who use follower count as a proxy for influence.

May 14, 2013 by ihersey

Social Media Numbers for “The Voice” Top 12

Last night’s “Voice” episode featured the Top 12, and for the first time this season, audience votes and downloads are the only determining factor for who goes home tonight. Like last week, I’ll be looking at Twitter numbers as a proxy for voting behavior so that we can see the extent to which opinions expressed there correlate with results. The voting methods as explained on the show were phone, text, Facebook and iTunes download, so Twitter is an independent factor in this equation and has no direct bearing on the outcome.

Last week, the Twitter-derived numbers correlated reasonably well. Not perfectly, though. For Team Adam, I hadn’t set up the topic in time, so had about 1/5 of the numbers I had for the Team Blake and Team Shakira contestants. So there my numbers were just plain off. The other factor was the judge’s discretion, and Blake elected to save the Swon Brothers rather than Justin Rivers. Welcome to the big leagues!

I personally enjoyed many of last night’s performances, but what’s important is what those who might vote thought. I took a couple of different cuts at the data. The first is simply looking at “share of voice” from the period of the airing of the show (starting in the Eastern time zone) through late this morning Pacific time. Here’s what we get:

Simply taking the ranking from most mentions to least during this time period, we end up with a Bottom 2 of Kris Thomas and Michelle Chamuel. We could look at the data another way, though: if we filter by positive sentiment — the idea being that people who express positive opinions are more likely to vote (or perhaps “you vote FOR someone, not AGAINST someone else”) — then the rankings change a little:

Now Amber Carrington and Holly Tucker make up the Bottom 2. However, note that the overall numbers aren’t very high when we apply this filter (sentiment analysis isn’t an exact science by any means, and the software used here is biased towards precision over recall, so is somewhat conservative).
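For anyone unfamiliar with the terms, here is a quick sketch of what "precision over recall" means in practice. The counts are invented purely for illustration and are not measurements of the sentiment software used here.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from raw classification counts."""
    precision = true_pos / (true_pos + false_pos)  # of tweets flagged positive, how many really are
    recall = true_pos / (true_pos + false_neg)     # of truly positive tweets, how many got flagged
    return precision, recall


# Hypothetical: 1,000 truly positive tweets; the classifier flags 420, of which 400 are right.
p, r = precision_recall(true_pos=400, false_pos=20, false_neg=600)
print(f"precision={p:.2f} recall={r:.2f}")
# prints precision=0.95 recall=0.40
```

A classifier tuned this way is rarely wrong when it does flag a tweet as positive, but it leaves most positive tweets unflagged, which is why the filtered totals come out low.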

Confounding the data further is iTunes, which doesn’t give precise data but does provide a “popularity” meter. If we look at each of the above sets of Bottom 2, we do see that Amber and Michelle’s performances did not max out the popularity meter, whereas Kris’ and Holly’s did.

One other data point we can look at is a week-to-week difference in Twitter followers. If we rank by percentage gain first, then by overall number, we get:

Twitter handle	FollowersLastWeek	FollowersThisWeek	Delta	Percentage
@garrettgardner2	16893	24426	7533	45%
@michellechamuel	14872	21305	6433	43%
@dbradbery	32352	45723	13371	41%
@josiahhawley	29100	40894	11794	41%
@theswonbrothers	10644	14750	4106	39%
@hollytmusic	10232	13454	3222	31%
@ambercarrington	11274	14520	3246	29%
@kristhomasmusik	9721	12130	2409	25%
@sarahsimmusic	20826	25314	4488	22%
@sashaallenmusic	13735	16674	2939	21%
@judith_hill	36078	39953	3875	11%
@vedothesinger	59217	61606	2389	4%
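For reference, the Delta and Percentage columns can be recomputed from the two follower counts with a few lines of Python. This is just a sketch of the arithmetic and the sort (rounded percentage gain first, then current follower count), not the tool originally used to build the table.

```python
# (followers last week, followers this week), copied from the table above
followers = {
    "@garrettgardner2": (16893, 24426),
    "@michellechamuel": (14872, 21305),
    "@dbradbery": (32352, 45723),
    "@josiahhawley": (29100, 40894),
    "@theswonbrothers": (10644, 14750),
    "@hollytmusic": (10232, 13454),
    "@ambercarrington": (11274, 14520),
    "@kristhomasmusik": (9721, 12130),
    "@sarahsimmusic": (20826, 25314),
    "@sashaallenmusic": (13735, 16674),
    "@judith_hill": (36078, 39953),
    "@vedothesinger": (59217, 61606),
}

rows = []
for handle, (last_week, this_week) in followers.items():
    delta = this_week - last_week
    pct = round(100 * delta / last_week)  # whole-number percentage, as in the table
    rows.append((handle, last_week, this_week, delta, pct))

# Rank by percentage gain first, then by current follower count, both descending.
rows.sort(key=lambda row: (row[4], row[2]), reverse=True)

for handle, last_week, this_week, delta, pct in rows:
    print(f"{handle}\t{last_week}\t{this_week}\t{delta}\t{pct}%")
```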

Based on this, it looks like Michelle picked up a lot of followers, so I’m going to have to go with Kris and Amber as the ones going home, with a possibility that Holly ends up somewhere in there. But I’m not taking it to Vegas — the differences aren’t large. Tough call this week!

Enjoy the show!

May 8, 2013 by ihersey

It’s Finals Season Once Again — Prediction Time for “The Voice”

I was planning to enjoy a few months as a professional triathlete — well, that’s what my wife calls my vacation from work — but a little issue somewhere in the left glute / piriformis / hamstring has curtailed my running, so I find myself back in the world of social analytics, but totally for fun. And what’s more fun than finals season on “American Idol” and “The Voice”?

I’m often asked about (and asked to speak about) the predictive power of social media analysis, and I always tell people it’s more of an art than a science. Also, the predictive power of social media numbers varies according to what you’re trying to predict. Events that are rooted in popularity, though, correlate pretty well with numbers you see in social media, and that’s especially true where the events have a voting system that isn’t just one person, one vote (like political elections). Hence, televised singing contests where the audience votes.

We’re well into “Idol” right now — we’re down to the Top 3, one of whom gets booted this week, and next week is the finale — so I’ve decided to take a look instead at “The Voice,” which is several weeks away from its finale and still has, as of this writing, 16 contestants. Tonight, four will go home — one from each judge’s team. The way the process is supposed to work, the top two vote getters on each of the four teams advance, and then each judge gets to choose among his/her bottom two for who advances and who goes home. Therefore, audience votes get you in the top two, but then it’s up to the judges — not the best scenario for showcasing predictive power of social media.

Nevertheless, let’s look at the data we have. First, the contestants, their Twitter handles, their current follower count, and their Klout score:

Contestant	Twitter handle	Followers	Klout
Danielle Bradbery	@dbradbery	32352	66
Holly Tucker	@hollytmusic	10232	65
Justin Rivers	@justinrivers	58910	63
Swon Brothers	@theswonbrothers	10644	63
Sasha Allen	@sashaallenmusic	13735	68
Garrett Gardner	@garrettgardner2	16893	63
Kris Thomas	@kristhomasmusik	9721	65
Karina Iglesias	@karinaiglesias_	9024	64
Caroline Glaser	@carolineglaser	40132	68
Judith Hill	@judith_hill	36078	73
Sarah Simmons	@sarahsimmusic	20826	66
Amber Carrington	@ambercarrington	11274	63
Josiah Hawley	@josiahhawley	29100	68
Michelle Chamuel	@michellechamuel	14872	66
Vedo	@vedothesinger	59217	72
Cathia	@cathiasings	10527	66

Certain contestants “punch above their weight” when you compare their Klout score to their follower count; Klout’s metrics are proprietary but place a greater emphasis on engagement (e.g., replies, retweets, etc.) vs. pure potential audience size.

Just as a comparison point, if we look at the three remaining contestants on “Idol,” we see the following:

Contestant	Twitter handle	Followers	Klout
Angie Miller	@angieai12	143664	79
Candice Glover	@candiceai12	78983	78
Kree Harrison	@kreeai12	74503	74

So even if “The Voice” is beating “Idol” in the ratings, the “Idol” contestants have a greater social media presence by several measures than do the “Voice” contestants. An interesting side note: all “Idol” finalists, including the ones that have long since gone home, have “verified” accounts on Twitter (meaning that Twitter considers them celebrities), whereas none of the “Voice” contestants do. This suggests to me that a deal was brokered between the “Idol” producers and Twitter.

Then we have actual buzz on social media. In my Attensity Media account, I set up a “Voice” topic and also created specific “entities” for all of the contestants that conflated their names, Twitter handles and hashtags so that I could get their counts in one place. I did set this up pretty late on Monday, after the show had aired in prime time, so the “Team Adam” and “Team Usher” contestants (who performed that evening) will have lower numbers than the “Team Blake” and “Team Shakira” contestants, who performed last night. Anyway, here’s what the live dashboard looks like:

So what do we have? We can compare two sets of numbers: general social media popularity and current week’s “buzz,” keeping in mind the grouping by judge. If we do that, we get:

Team Adam

Contestant	Followers	Klout	Buzz
Caroline Glaser	40132	68	2690
Judith Hill	36078	73	2500
Sarah Simmons	20826	66	1238
*Amber Carrington*	11274	63	1101

Team Blake

Contestant	Followers	Klout	Buzz
Justin Rivers	58910	63	4671
Danielle Bradbery	32352	66	10598
*Swon Brothers*	10644	63	4011
Holly Tucker	10232	65	5167

Team Shakira

Contestant	Followers	Klout	Buzz
Garrett Gardner	16893	63	5388
Sasha Allen	13735	68	5515
Kris Thomas	9721	65	4192
*Karina Iglesias*	9024	64	2439

Team Usher

Contestant	Followers	Klout	Buzz
Vedo	59217	72	1516
Josiah Hawley	29100	68	1784
Michelle Chamuel	14872	66	1728
*Cathia*	10527	66	1125
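As a sketch of one way to read these tables, the snippet below takes the last number in each row (this week's buzz count), computes each contestant's share of voice within their team, and picks out the lowest on each team. The entries marked with asterisks above happen to be exactly those lowest-buzz contestants. The variable names are arbitrary, and ranking by raw buzz alone is a simplification that ignores follower count and Klout.

```python
# Buzz counts per contestant, grouped by team, copied from the tables above.
buzz = {
    "Team Adam": {"Caroline Glaser": 2690, "Judith Hill": 2500,
                  "Sarah Simmons": 1238, "Amber Carrington": 1101},
    "Team Blake": {"Justin Rivers": 4671, "Danielle Bradbery": 10598,
                   "Swon Brothers": 4011, "Holly Tucker": 5167},
    "Team Shakira": {"Garrett Gardner": 5388, "Sasha Allen": 5515,
                     "Kris Thomas": 4192, "Karina Iglesias": 2439},
    "Team Usher": {"Vedo": 1516, "Josiah Hawley": 1784,
                   "Michelle Chamuel": 1728, "Cathia": 1125},
}

lowest_by_team = {}
for team, counts in buzz.items():
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    print(team)
    for name, mentions in ranked:
        # Share of voice: this contestant's mentions as a fraction of the team's total.
        print(f"  {name:<18} {mentions:>6} {100 * mentions / total:5.1f}%")
    lowest_by_team[team] = ranked[-1][0]

print(lowest_by_team)
```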

Again, the buzz numbers are artificially low for Team Adam and Team Usher contestants since I set up the topic late. There’s also the judges’ discretion in which of the bottom two vote-getters each judge decides to eliminate. That said, we’ll see tonight how predictive the social numbers are. Enjoy the show!

December 31, 2012 by ihersey

Auto-generated 2012 in review

Every social network seems to be offering a “year in review” feature. WordPress (who hosts this blog) is no different. Here’s their auto-generated “annual report”:

Here’s an excerpt:

600 people reached the top of Mt. Everest in 2012. This blog got about 2,800 views in 2012. If every person who reached the top of Mt. Everest viewed this blog, it would have taken 5 years to get that many views.

Click here to see the complete report.

The presentation is actually pretty slick. But the information in it is pretty mundane and easily derived (number of views, number of comments): what we in the biz call “social activity.” The one part that’s not obviously derived is the one about where people are coming from and which search terms got those people to the blog. These stats require knowledge about location of the viewer, which in the case of blogs comes from a mapping of IP address to locations (there are entire companies built around this idea, Neustar being one via their acquisition of Quova), or, in the case of search terms, about the referring URL for the viewer, part of what’s called “clickstream analysis.”
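To give a flavor of the search-term half of that, here is a minimal sketch of pulling a query out of a referring URL with Python's standard library. It assumes the search engine passes the query in a `q` parameter of the referrer, which is not always the case; the `search_terms` function and the example URL are made up for illustration and are not how WordPress actually does it.

```python
from urllib.parse import parse_qs, urlparse


def search_terms(referrer):
    """Return the search query embedded in a referring URL, or None if there isn't one."""
    query = parse_qs(urlparse(referrer).query)
    terms = query.get("q", [])  # assumption: the query travels in a 'q' parameter
    return terms[0] if terms else None


# Hypothetical referrer, for illustration only.
print(search_terms("https://www.google.com/search?q=hawaii+70.3+race+report"))
# prints hawaii 70.3 race report
```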

What’s missing for someone who really cares about social analytics is a number of other things:

How many times the blog post was linked to in Twitter, Facebook and other social networks. WordPress would need to have access to the full content stream from the other networks in order to know the real number. What they do know via clickstream analysis is when someone actually clicks on the link and goes to the page. Which of course is the most important thing — if you tweet a link and no one clicks on it, does it matter?
Even the most basic content analysis — what were the key themes discussed (you’d see a lot about “triathlon” in mine, for example, even in this post thanks to this parenthetical comment), what was the sentiment in the comments, tweets and Facebook posts that mentioned each blog entry, etc?

These two items are non-trivial to auto-generate. The former requires much more openness among the various social networks, which due to privacy concerns and policies, business model and “monetization” strategies and competition for users’ attention are becoming increasingly balkanized and locked down. The latter requires automated content analysis of the kind my company Attensity does, and believe me this stuff is hard to do accurately in a general-purpose way.

But perhaps the 2013 year in review will add a little more intelligence and start to move towards something that is actually interesting.

December 9, 2012 by ihersey

Rise of the Brand Ambassador

This might be my first personal blog that talks about work. Or sort of about work. It’s also about triathlon. But only sort of. (Reader rolls eyes and awaits another disjointed blog post. Or stops reading altogether.)

My work specialty is text analytics (a combination of computational linguistics and business intelligence), mostly applied to social media these days. What that means is that we at Attensity analyze the content of what people say on social media along with all of the other “social graph” data: e.g., how influential people are, how things get retweeted, liked, +1d, etc. The critical part is how to accurately map random human language into structures that correspond to meaning, so that they can be counted, tracked and trended in useful ways. Oh, and do that on many thousands of posts per second without falling behind.

What’s come out of that work, besides a lot of variety (I’ve worked with the Fortune 500, the intelligence community and major media and entertainment companies) is a rapid-fire introduction to the business side of social media. We’ve applied our technology to everything from following the U.S. presidential elections (starting with the GOP primaries) and Arab Spring to who’s getting voted off each week on American Idol or The Voice. There’s a lot of subtlety to what goes on; it’s not just about identifying positive and negative sentiment. Particularly important is the role of influence.

Influence is harder to measure than it might seem — it’s way more complicated than how many followers you have on Twitter or friends on Facebook. Entire companies (e.g., Klout) have been built around the attempt to quantify influence, but even their presumably sophisticated metrics don’t ring entirely true to many. What’s clear is that influence is topic specific — if you look at the most-mentioned celebrities on Twitter at any given moment, for example, you’ll almost always find Justin Bieber at the top. However, on election day, if you looked at election-specific tweets, as we did for Bloomberg, the top celebrity aside from the candidates was Jay Z. (You can watch the video to find out why.)

(The reader is wondering when we’re going to talk swim, bike and run. Patience!) Topic-specific influence has created novel ways for companies to market their products and brands: among them, the brand ambassador. If you think about traditional advertising, celebrities are often used as brand ambassadors, but celebrities are expensive. The social brand ambassador, on the other hand, doesn’t need to be a celebrity per se; they just need to influence a sufficiently large network of people on a particular topic.

Which brings me around to triathlon. I’ve found myself, quite unexpectedly, in the position of having become a brand ambassador. Not once, but twice already, and I’m likely to pull the trigger on a third. Why is that? There are many faster guys out there than I, though I’m reasonably quick for a 50-year-old age grouper. I can think of a few reasons:

I’m part of a community. I am very active on my triathlon team, and I race within my local community in addition to bigger races elsewhere.
I’m active on social media, but not overactive. I try to be interesting and honest, without oversharing. Hopefully I succeed more than I fail.
I am a gearhead. I will try almost any new product if I think it will give me an edge. I would never endorse a product just because I got it for free or heavily discounted — my litmus test is would I use this if I had to pay full price for it? Actually, in the case of TrainingPeaks and many of Wattie Ink’s sponsors, I am and already was a customer and avid user.
I work with other athletes to help them improve. I am eager to share what I’ve learned — which tools to use, which training sessions are most effective for a particular end goal — and to see my advice through to implementation. I think I’m most proud of the level I got my “Garage of Pain” training buddy Mike to this past year, even compared to my own results.

What’s ironic for my day job is that — so I’m told, anyway — one of my company’s investors at one point laid a bunch of printouts of various of my Facebook posts down during a board meeting and opined that it seemed that all I did was train and race. (If that were true, btw, I should have much better results than I’ve had.) Notwithstanding potential jealousy (he’s…um…not exactly the fittest individual on the planet) and probable violation of European privacy laws (he’s not a Facebook friend, so had no right to access any of my posts), he was missing the point on one of the central themes of a company he’s invested a lot of money in: social influence.

Becoming an influencer isn’t hard: write about what you know, be passionate, interesting and real, and connect your social presence back to a community of people at least some of whom you know in real life. Oh, and occasionally kick some ass in a triathlon or two. 🙂

An interesting side note on the confluence of work and hobby: the evening before the Wattie Ink Elite Team selection was announced, I got the following DM on Twitter:

What was funny about that was that I had been on set at The Voice a couple of nights before, thanks to our work in media. While I had made a couple of random references to it on Twitter, I had mostly posted about it to my Facebook friends. So anyway, Wattie did his homework.

Maybe I’ll get him to be my lead investor next time. 🙂

Twitter handle	Followers 5/14	Followers 5/21	Delta	Percentage
@theswonbrothers	14750	19588	4838	33%
@michellechamuel	21305	27370	6065	28%
@josiahhawley	40894	50517	9623	24%
@dbradbery	45723	55430	9707	21%
@hollytmusic	13454	16273	2819	21%
@ambercarrington	14520	17149	2629	18%
@kristhomasmusik	12130	13966	1836	15%
@sarahsimmusic	25314	28828	3514	14%
@sashaallenmusic	16674	18985	2311	14%
@judith_hill	39953	43041	3088	8%

spinning

by day I'm a software exec; by night I'm a lifelong, moderately talented endurance athlete