Blog

Real-Time Audience Amplification

Enabling advertisers to purchase “lookalikes” is a key trend in online advertising these days. In fact, Quantcast has recently changed their website to highlight this opportunity! And more and more companies are being added to the mix.

This trend will continue to be accelerated by the unbundling of media from data and the rapid growth of real-time bidding (RTB) platforms. Overall, we are witnessing the ever-increasing value that audience data brings to the marketplace (great infographic below):

Ad Value Chain

(JEGI Estimates)

It can be argued that (high-performance) audience data will be a key ingredient in the ad value chain, and this will continue to put pressure on other players in the space.

From a lookalike modeling perspective, advertisers can now amplify their audience buys through companies that find other users who are related to their core audience in some way. Interestingly, Media6 does this by looking at the social graph for related users. Their premise is that “birds of a feather flock together”, and that homophily is a pervasive characteristic of friendship networks. Other companies look at profile-centric relations, like overlapping interests or segments, where similarity between users is based on correlation over some a priori characteristics or features. These techniques tend to be monotone and binary: either a relationship exists or it doesn’t. In some cases, they look at reciprocity and frequency. However, they tend to ignore the temporal nature of behavior, namely that (a) a consumer’s current state affects his/her future movements and (b) a decay function should regulate the salience of events, so that more recent ones count for more.
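To make point (b) concrete, here is a minimal sketch of one common kind of decay function, an exponential decay with a half-life. The function names, the seven-day half-life and the toy events are our own assumptions for illustration, not a description of any vendor's system.

```python
def decayed_weight(age_days, half_life_days=7.0):
    """Weight of an event that happened `age_days` ago (1.0 for a brand-new event)."""
    return 0.5 ** (age_days / half_life_days)


def interest_scores(events, now, half_life_days=7.0):
    """Sum the decayed weights per topic; `events` holds (topic, day) pairs."""
    scores = {}
    for topic, day in events:
        weight = decayed_weight(now - day, half_life_days)
        scores[topic] = scores.get(topic, 0.0) + weight
    return scores


# Hypothetical history: two old "cars" events and one recent "travel" event.
events = [("cars", 0.0), ("cars", 1.0), ("travel", 13.0)]
scores = interest_scores(events, now=14.0)
print(scores)
```

With these numbers the single recent "travel" event outweighs the two older "cars" events, which a pure frequency count would rank the other way around.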

To increase the performance and the future potential of audience amplification, we hold to the following:

There are a myriad of ways in which consumers can be connected, both explicitly (defined friendship or interest) and implicitly (behaviorally). The context of the relationship matters. To Sociocast, context means interests and intents.

Amplification is the flip-side of influence and communication flow. In networks we use individuals as proxies for others, thereby increasing the size of particular characteristics. That is, there needs to be a social network centric view on how audiences are amplified, but this network needs to be contextual and influence based.

This is the Sociocast Hypergraph.


On Predicting Social Links

In a reply letter to Adams’ “Distant Friends, Close Strangers? Inferring Friendships from Behavior”, Pentland, Eagle et al. write:

…We believe that even studies of inherently social phenomena, such as the spread of influence or supposed “social contagions”, can benefit strongly from a focus on objective behavioral data. For instance, the conventional wisdom is that social influence only travels along self-perceived ties. However, in truth, it remains unknown how much is being hidden from us by recency and saliency cognitive filters, and significant social influence, may, in fact, travel across unperceived ties. Behavioral data are not prone to such filters and thus, when used properly, may shed considerable light on such important questions.

When we look at understanding consumer preferences, within the backdrop of social interaction and influence, we can benefit by capturing behavioral data (as opposed to explicit social relationship data). To the extent that we can “infer” relationships between people based on their behaviors, we would need to test whether the relationships inferred are correct. But how?

If the predictions that we make on an individual’s behavior (using inferred social relationships) are correct, then our networks must correctly model reality.

Behavioral Network

(VisualComplexity.com)

What this means is that the implied social network we extrapolate from user behavior is right if it leads to the right predictions. But this raises important questions:

  • Can we really say that the networks are (quasi-)isomorphic?
  • Are there (almost infinite) permutations of real networks that can all equally be used for real-world prediction?

Identity and Control

Just read Bynamite’s article in NYTimes. Interesting theory. Not sure it has legs. I forget where, but I once read that telling someone to share their data in exchange for better relevancy is like telling them to invest in hedge funds because it makes the markets more efficient.

I have to agree.

Bynamite

(Bynamite.com)

What matters besides control is incentive. We must tie action to immediate reward. We must tie every user to a measure of influence and a notion of identity.

While we are focused on building the most advanced, most real-time and predictive audience data, in early 2011 we will be releasing our next-generation consumer dashboard. Get ready!!


The Hidden Pattern Behind Everything We Do

Albert Einstein said, “If you can’t explain it simply, you don’t understand it well enough.”  Barabasi follows this favorite principle of mine in his new book Bursts. It is a revolutionary new theory showing how we can predict human behavior.

Barabasi shows how everything that we do has an inherent rhythm. We work, play and engage with others in short flourishes of activity followed by next to nothing. These bursts follow a simple power law, the same kind of pattern that shows up in many other systems, from how web pages are linked to each other to how diseases spread from one continent to another. That is elegant simplicity!

How does this relate to what we do at Sociocast, you may ask. At Sociocast, we are great believers in coming up with a simple and elegant explanation of how we discover and consume information on the web. We look at your behavior online, and at the behavior of others like you, and make inferences about how you stumbled upon a piece of content and what kinds of content you might like. We can then predict where you’d like to go next based on what you are currently reading. Why does this work? It works because we all have limited attention spans and therefore have to prioritize in order to get things done. In fact, this prioritization is what causes the bursty activity that Barabasi explains so elegantly in his book.

That brings me to my all-time favorite Einstein quote: “Everything must be made as simple as possible, but not simpler.” That’s what we are doing at Sociocast.


Social Dynamics = Relevance (On Social Mobility – Part III)

Just minutes before writing this post, I was lazily browsing on Amazon.com. My attention was called to a module titled “New For You,” where it was suggested that I buy “Fundamentals of Matrix Computations” by David S. Watkins. Boring, sure. Certainly a bit drab for my then-current mood. But there was some sense to their suggestion. Sure, in the past, I have surfed for my share of obscure linear algebra books, even purchased a couple. So although the underlying black box of algorithms that generated the recommendation is hidden, I know the wisdom of the crowds powers its results, in particular a technology called Collaborative Filtering (CF).

Crowds

(Google Images)

Now when I first began researching CF (back when I was young and beautiful), it made immediate sense that consumers with similar histories and like ratings on things could somehow be collected into neighborhoods of similarity and that those neighborhoods could be used as proxies for consumer recommendations. It is somehow intuitive that we can exploit the browsing or purchasing behavior of the collective to help individuals discover new things they will like.

It was around that time that I stumbled upon a paper called “Influence in Ratings-Based Recommender Systems,” which uncovered a fundamental insight: if we define a measure of the effect that one consumer has on another’s recommendations under the CF algorithm, we can determine the implied influence that one user has on another (granted, of course, that our predictions are correct). Now if we were to do this for the whole network of consumers, we would effectively be generating an influence network (or influence graph), in which the nodes are consumers and the edges represent the direction and the strength of the influence (for more on social graphs, see my previous post). This is incredibly interesting! But we are left with a question: Why does this work?
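As a rough illustration of the idea (not the measure defined in that paper), the sketch below scores the influence of one user on another as the average shift in the second user's predicted ratings when the first user is left out of a simple user-based CF model. The cosine similarity, the fallback to a user's own mean rating, and the toy ratings are all assumptions made for this example.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity over the items two users have both rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (norm_a * norm_b)


def predict(ratings, user, item, exclude=None):
    """Similarity-weighted average of other users' ratings for `item`."""
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or other == exclude or item not in theirs:
            continue
        weight = cosine(ratings[user], theirs)
        num += weight * theirs[item]
        den += weight
    if den > 0:
        return num / den
    # No usable neighbor: fall back to the user's own mean rating.
    own = ratings[user]
    return sum(own.values()) / len(own)


def influence(ratings, src, dst):
    """Average shift in dst's predictions when src is removed from the data."""
    all_items = {item for r in ratings.values() for item in r}
    unseen = all_items - set(ratings[dst])
    if not unseen:
        return 0.0
    shifts = [
        abs(predict(ratings, dst, item) - predict(ratings, dst, item, exclude=src))
        for item in unseen
    ]
    return sum(shifts) / len(shifts)


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ann": {"a": 5, "b": 1, "c": 5},
    "bob": {"a": 5, "b": 1},
    "cat": {"a": 1, "b": 5, "c": 1},
}

# Directed, weighted influence graph: (source, target) -> strength.
influence_graph = {
    (u, v): influence(ratings, u, v)
    for u in ratings for v in ratings if u != v
}
print(influence_graph)
```

In this toy data ann, whose tastes match bob's, moves bob's prediction for item "c" far more than cat does, so the edge from ann to bob is the stronger one.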

There are some interesting properties about real-world networks worth noting:

  • Small-world – networks have small characteristic path lengths, ensuring that any two nodes are reachable in few hops (six degrees of separation).
  • Clustered – a high degree of “cliqueness”. My friends tend to be friends with each other (we refer to this as redundancy).
  • Scale-free – the number of connections per node follows a heavy-tailed (power-law) distribution.
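For readers who want to see these three properties as numbers, here is a small sketch that computes the average shortest-path length, the local clustering coefficient and the degree distribution of an undirected graph. The adjacency-dictionary layout and the four-node toy graph are assumptions for illustration; a graph this small cannot, of course, show a power law.

```python
from collections import Counter, deque
from itertools import combinations


def clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def avg_path_length(adj):
    """Mean shortest-path length over all reachable ordered node pairs (BFS)."""
    total = pairs = 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            cur = queue.popleft()
            for nxt in adj[cur]:
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def degree_distribution(adj):
    """Map each degree to the number of nodes that have it."""
    return Counter(len(nbrs) for nbrs in adj.values())


# Toy undirected graph: a triangle (a, b, c) with d hanging off c.
adj = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(avg_path_length(adj), clustering(adj, "c"), degree_distribution(adj))
```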

Without going too deep into these ideas, the properties above ensure that when it comes to real-world networks, we are able to:

  • Predict and Compress – prediction and compression are the same in this regard. If we understand the probabilistic distribution and redundancy of node linkages, we are able to tightly compress the network (see information theory). Low entropy implies a high predictability, so the two features are tightly related.
  • Pattern Storage – in this regard networks act like associative memories: they are able to store patterns (or memories) through the topology of interconnections between nodes (like a neural network). Thanks to Manny Aparicio at Saffron Technology for this insight!

When we look at the Sociocast Hypergraph, its multi-relational nature allows us to store the myriad of ways in which people influence each other. We extrapolate and infer these connections between people, based on their behaviors (their movements through ideas, topics, and concepts) and explicitly through their social relationships online. Our model allows us to build a network that mimics the real network, to the extent that it stores the same memories. Memories, for us, are the opinions people hold about things (rankings) and the associations (relationships) they make between them.

The underlying structure of real-world networks, and their natural redundancy, give us an incredible predictor of individual behaviors.

We will expound on this, with empirical evidence, in our forthcoming whitepaper.


The Sociocast Hypergraph (On Social Mobility – Part II)

While graph theory has its roots in Euler’s Königsberg bridges, with a myriad of applications since then, its relationship to social networks has certainly been spotlighted by Zuckerberg’s Social Graph.

In its most basic form, a graph represents a set of entities and their relationships (for a great introduction to network theory, see Bradford Cross’ post about Network Theory). We refer to the entities as nodes and the lines between them as edges. In Facebook’s Social Graph, the nodes represent people and the edges represent the existence of friendship between them.

Simple Graph

(Wikipedia)

Certainly, we can derive very interesting properties and patterns from simple graphs. We can study the centrality of individual entities, their prestige and importance, their connectivity. But to take our analysis further, we can relieve the simple graph of some of its innate limitations:

  • Directed – we add direction to each edge, to model a flow phenomenon (i.e. influence). This means that the relationship between two individuals is not necessarily symmetric (think followers on Twitter).
  • Weighted – we add a value to each edge which signifies the strength of the relationship. Not every influence relationship has the same intensity.
  • Multiplex – we allow more than one edge between any two nodes, so that the same pair of nodes can have multiple types of relationships with each other.
  • Self-loops – we allow our individual nodes to connect to themselves, representing autonomy and self-influence.

While generalizing our notion of a simple graph adds complexity, it enables us to better model the reality in which individual behaviors take place.
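The four generalizations listed above fit in a very small data structure. The sketch below is our own illustration with hypothetical class and field names, not Sociocast's implementation: every edge carries a direction, a weight and a context label, parallel edges are allowed, and nothing prevents an edge from pointing back at its own source.

```python
from collections import defaultdict, namedtuple

# One directed edge: who influences whom, in which context, and how strongly.
Edge = namedtuple("Edge", "source target context weight")


class MultiRelationalGraph:
    """Directed, weighted graph that allows parallel edges and self-loops."""

    def __init__(self):
        self._out = defaultdict(list)  # source -> list of outgoing edges

    def add_edge(self, source, target, context, weight=1.0):
        self._out[source].append(Edge(source, target, context, weight))

    def edges(self, source, context=None):
        """Outgoing edges of `source`, optionally restricted to one context."""
        return [
            e for e in self._out.get(source, [])
            if context is None or e.context == context
        ]

    def strength(self, source, target):
        """Total weight from source to target across all contexts."""
        return sum(e.weight for e in self._out.get(source, []) if e.target == target)


g = MultiRelationalGraph()
g.add_edge("alice", "bob", "friend", 1.0)     # explicit relationship
g.add_edge("alice", "bob", "cycling", 0.4)    # parallel edge, second context
g.add_edge("bob", "alice", "cycling", 0.9)    # direction matters
g.add_edge("alice", "alice", "cycling", 0.7)  # self-loop
print(g.strength("alice", "bob"), g.strength("bob", "alice"))
```

Keeping the context on the edge rather than on the node is what lets the same pair of people be strongly connected on one topic and barely connected on another.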

This is the Sociocast Hypergraph, a directed, weighted, multi-relational, self-looped graph, where the edges represent the contexts between individuals (see “What does human mobility mean when it’s online? (On Social Mobility – Part I)”).


What does human mobility mean when it’s online? (On Social Mobility – Part I)

Location implies specificity in the physical world. If you are here, you cannot be there. To mark your position, we can attach spatial coordinates (like latitude and longitude).

In Reality Mining as well as in Barabasi’s research on human mobility, data is derived from human mobility patterns in physical space. Mobile phone tower data is used as a proxy for determining individual movements. For each user, a mobility network is constructed, where the nodes are specific locations and the paths between the nodes represent transitions. The mobility network is used to predict an individual’s future movements with considerable accuracy.
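A mobility network of this kind can be sketched as a table of transition counts. The example below is a simplified first-order model of our own, with made-up locations, and is not the method used in the research cited above: it counts how often each location follows another and predicts the most frequent successor.

```python
from collections import Counter, defaultdict


def build_mobility_network(visits):
    """Count transitions between consecutive locations in a visit sequence."""
    transitions = defaultdict(Counter)
    for here, there in zip(visits, visits[1:]):
        transitions[here][there] += 1
    return transitions


def predict_next(transitions, current):
    """Most frequently observed next location, or None if `current` is unseen."""
    options = transitions.get(current)
    if not options:
        return None
    return options.most_common(1)[0][0]


# Hypothetical sequence of locations for one user.
visits = ["home", "work", "cafe", "work", "home",
          "work", "gym", "home", "work", "cafe"]
network = build_mobility_network(visits)
print(predict_next(network, "home"), predict_next(network, "work"))
```

The same structure carries over once "location" is given an online meaning: only the labels on the nodes change.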

To translate this thinking to the online world, we must first give meaning to our notion of “location.”

The obvious solution may be to consider using web pages. In Sergey Brin and Larry Page’s initial whitepaper, a probabilistic model of the web graph was built to determine the importance of web pages for ranking search results in Google. The individual nodes are web pages and the paths between the pages represent web links. In their model, locations are particular web pages (web surfers transition between web pages). This is not ideal for our model for several reasons:

  • Web pages intrinsically represent content, and content represents ideas. What we would like to build is an understanding of the associations between ideas or concepts. While the web certainly contains an implicit relational structure between topical areas, there is a finer semantic granularity to be deduced.
  • The web is a messy place. Web link structure may not always be an indicator of importance, nor do two pages dealing with the same subject represent two distinct locations. Content and concepts are the intrinsic “atoms” which we would like to model and understand.
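For reference, the page-as-location model mentioned above can be sketched in a few lines as a random surfer who follows links with some probability and otherwise jumps to a random page. This is a textbook-style power iteration written for illustration; the damping value of 0.85, the fixed iteration count and the toy link graph are assumptions of this sketch.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank; `links` maps each page to the pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives the random-jump share first.
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page without outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new[p] += share
        rank = new
    return rank


# Toy web: three pages link to "c", and "c" links back to "a".
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
rank = pagerank(links)
print(sorted(rank, key=rank.get, reverse=True))
```

Note that the ranks describe pages, not the ideas on them, which is exactly the limitation the two bullets above point at.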

At Sociocast, we process immense amounts of structured and unstructured content (what we call user observations) and apply natural language processing techniques to extract their meaning. Once we deduce the particular contexts or topics within an observation, we can determine a user’s location by inserting them into what we call Context Space.

We look to understand the associations between concepts, by observing how people transition between them (sequentially).

A simple but powerful idea.

Semantic Web

(Flickr)


On Social Mobility

If you’ve happened to read Barabasi’s article in Science Magazine, “Limits of Predictability in Human Mobility,” you may already realize the importance of temporal data as a predictor for human behavior.

Barabasi Science

(Science)

If curiosity kept you digging, you would have quickly discovered that the research also has its roots in Nathan Eagle and Sandy Pentland’s Reality Mining project at MIT. The Reality Mining project utilizes mobile phone data to create predictive models for individuals as well as for the complex social systems in which these individuals are embedded. The aim, of course, is to better understand and forecast human behavior.

In Barabasi’s Science article we read that, using mobile phone data, he and his team found a 93% potential predictability in user mobility within their data set. This means that, in principle, where a user was going to be could be predicted correctly up to 93% of the time (based on their previous locations). They also found that individuals exhibit considerably low entropy (high predictability), pointing to a ‘deep-rooted’ regularity in our behaviors (we encourage you to read the Science paper for more details).
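To get a feel for what "low entropy" means here, the sketch below compares two simple entropy measures on a made-up location history: the entropy a user would have if every visited location were equally likely, and the Shannon entropy of the actual visit frequencies. This is a simplification of our own; the Science paper goes further, using an entropy estimate that also accounts for the order of visits and then converting it into a predictability bound.

```python
from collections import Counter
from math import log2


def random_entropy(visits):
    """Entropy in bits if every distinct location were visited equally often."""
    return log2(len(set(visits)))


def frequency_entropy(visits):
    """Shannon entropy in bits of the observed visit frequencies."""
    counts = Counter(visits)
    total = len(visits)
    return -sum((c / total) * log2(c / total) for c in counts.values())


# Hypothetical location history dominated by a home/work routine.
visits = ["home", "work", "home", "work", "home", "work", "home", "gym"]
print(random_entropy(visits), frequency_entropy(visits))
```

The more a routine dominates the history, the further the second number falls below the first, and the fewer bits are needed on average to guess where the user is.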

Topic Map

(VisualComplexity.com)

This is an incredible finding: A considerable part of our predictive power is embedded in the sequence of our movements, rather than just the frequency or propensity that we exhibit for some action! Time matters. And so does order. (We recommend a great paper on Collaborative Filtering with Temporal Dynamics.)

At Sociocast, we aim to apply this same thinking to the online world, to generate the same level of predictability.

And that’s exactly what we do.

In the following blog posts we explore Sociocast’s techniques for predicting human mobility online using behavioral data with social network dynamics.


Sociocast NOT Sociopath!

…The words are similar. When asked about their (homophonic) similarity, we always answer: “the distance between insanity and genius is measured only by success.”

Welcome to Sociocast, where we are not afraid to think differently, to rethink traditional science, to shift paradigms, to launch revolutions. By revisiting and re-architecting our understanding of human (and social) behavior, Sociocast will drastically improve our experiences online.

Richard Feynman

(Richard Feynman)

Launching in Fall 2010, Sociocast’s real-time audience data will be used by both advertisers and publishers to optimize the delivery of advertising and facilitate the creation and distribution of content. All the while, we will be building great tools for people like us to control our privacy and improve our ability to discover the content we love.

To find out more, visit: http://www.sociocast.com.


Information

Sociocast discovers and delivers the most predictive REAL-TIME audience data in the marketplace.
