Archive for October, 2011

What is Big Data?

The folks over at IBM Research did a really good job of defining Big Data, with a slant toward the enterprise. With full credit to the team over there, we thought we would share their definition:

Every day, we create 2.5 quintillion bytes of data, so much that 90% of the data in the world today has been created in the last two years alone. This data comes from everywhere: from sensors used to gather climate information, posts to social media sites, digital pictures and videos posted online, transaction records of online purchases, and cell phone GPS signals, to name a few. This data is big data.

Big data spans three dimensions: Variety, Velocity and Volume.

Variety – Big data extends beyond structured data, including unstructured data of all varieties: text, audio, video, click streams, log files and more.

Velocity – Often time-sensitive, big data must be used as it is streaming in to the enterprise in order to maximize its value to the business.

Volume – Big data comes in one size: large. Enterprises are awash with data, easily amassing terabytes and even petabytes of information.


Why Advertising? 15 Billion Connected Devices by 2015… That’s Why.

At Sociocast we are applying Human Mobility theory to big unstructured data sets to make the data more useful. One of the first ways that we are applying that power is through marketing and advertising services. Why?

Because it represents a huge challenge.

There is so much raw, unstructured data available about web usage. There are currently 4 billion devices connected to the internet, such as computers, phones, televisions, and even refrigerators, and that number is growing to 15 billion connected and addressable devices by 2015. Each of these devices creates usage data as it performs its various tasks. Many of these devices are also “addressable,” allowing communications and messages to be directed back at selected devices. But which messages should go to which devices? How do we improve the signal-to-noise ratio of the messages arriving at a device? We see that as a huge and interesting problem to solve… and a lucrative one as well. Marketers, advertisers, and advertising agencies covet technologies that can place their messages in front of relevant consumers.

Listen as Vice President of Intel’s Architecture Group Kirk Skaugen explains the massive scale of the internet to an audience at the Web 2.0 Summit in San Francisco.


Pattern Recognition and Data Visualization

At Sociocast we are always talking about pattern recognition and data visualization. Both are important elements in our mission of making Big Data useful. We came across an absolutely gorgeous data visualization by artist Aaron Koblin. Aaron uses data from the U.S. Federal Aviation Administration to create animations of flight traffic patterns and density.

Data can be sexy and beautiful. Enjoy.


Big Ideas Needed for Big Data

Carla Rover’s article ‘Ad Tech’s Flip Factor’ from DiGiDAY pointed out that we are seeing a huge rise in ad tech. However, the really deep-rooted problems of the industry, such as ‘measurement, inventory quality, and data management’, are not being adequately addressed.

Big Data, utilized correctly, can solve these issues. The industry is booming and the landscape is constantly shifting. However, only a select few are innovating and coming up with new and creative ways to attack some of the industry’s toughest questions. The industry also needs more companies to come up with end-to-end models. Once this happens, marketers will be more empowered, consumers will be happier, and brands will thrive.

Big Data for a small world. 


From Punch Cards to Clouds

Maybe the sky is the limit when it comes to data storage… but I doubt it.

Do you ever wonder where all the data being collected in the world is actually stored? Without the significant advances in data storage, Big Data as we know it would not exist. Storage should not be taken for granted.

‘The History of Digital Storage’ gives a simple but effective account of the tremendous advances made in this field. It makes you realize just how far we have come, and in such a short time. Even more, it makes you contemplate where we will go from here.

It’s an exciting time for data enthusiasts!

Mashable's History of Digital Storage Infographics

Big Data for a Small World.

 


Big Data is Empowering…

Big Data is empowering, and not just to marketers. Whether you are in the fashion industry trying to target a niche group of buyers or a politician running a re-election campaign, Big Data has a significant role to play.

Almost every industry or field out there can be empowered by Big Data. It’s no longer a question of if, but of when, people will start to use it. We have already seen people using similar data sets to predict revolutions, help sequence human DNA, predict the next pandemic, and understand how well children are learning across schools.

This emerging industry is not just a marketer’s dream; it’s a tool for almost any researcher in any field. The future I see is not one where information is something that only the few have and are able to capitalize on. It will be one where information is transparent, open, and free. The game will be how well people are able to process and analyze, in real time, the unstructured data that is being produced in epic proportions these days.

Big Data has infinite possibilities and applications. It’s time we get real about it.


Hadoop is an essential tool for massive data crunching

In the midst of an “interesting” Oracle OpenWorld, there have been a few voices (including Oracle) saying that Hadoop (and I guess Map/Reduce in general) is best suited for basic data ingestion and maybe cheap, convenient storage (in HDFS).

In the words of DJ Kool, let me clear my throat…

Here at Sociocast, we’ve been using Map/Reduce (primarily Hadoop) for going on two years, and we worked with the technology for another two years before that. I’m not sure how many folks out there are dealing with true Internet-scale data crunching across billions of behavioral data points daily. But if you are dealing with Big Data for a Small World, Hadoop is an essential tool for massive data crunching on the cheap. Traditional systems either can’t process data at this scale fast enough (or at all), or can’t do it at the same cost and with the same low administration overhead (e.g., number of sysadmins and DBAs).

And it’s not just for data ingestion. Here are some of the other ways we use the technology across big data sets:

  • Dead-simple worker grid across large input sets with intelligent prioritization
  • Classifying billions of web events per day
  • Understanding 100s of millions of users based on real-time behavior across page views, search and social activity
  • Continuously modeling a dynamic social influence graph of 100s of millions of users across the entire Internet, not just on your favorite social network
  • Dynamic campaign audience selection from the entire user base
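For readers who have not seen the pattern, here is a minimal sketch of how a job like "classify web events and count them per user" breaks down into map, shuffle, and reduce steps. It is plain, single-process Python that imitates the shape of a Map/Reduce job, not Hadoop code and not our production pipeline. The keyword rules, the `(user_id, url)` event layout, and every function name below are assumptions made up for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical keyword-to-category rules; a real classifier would be far richer.
CATEGORY_KEYWORDS = {
    "fashion": ("style", "fashion"),
    "sports": ("sports", "score"),
}

Event = Tuple[str, str]  # assumed layout: (user_id, url)
Key = Tuple[str, str]    # (user_id, category)


def classify(url: str) -> str:
    """Return the first category whose keyword appears in the URL."""
    lowered = url.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(word in lowered for word in keywords):
            return category
    return "other"


def map_event(event: Event) -> Iterator[Tuple[Key, int]]:
    """Map step: emit ((user_id, category), 1) for one web event."""
    user_id, url = event
    yield (user_id, classify(url)), 1


def shuffle(pairs: Iterable[Tuple[Key, int]]) -> Dict[Key, List[int]]:
    """Shuffle step: group emitted values by key.

    In Hadoop the framework does this between the map and reduce phases.
    """
    grouped: Dict[Key, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key: Key, values: List[int]) -> Tuple[Key, int]:
    """Reduce step: sum the counts collected for one key."""
    return key, sum(values)


def run_job(events: Iterable[Event]) -> Dict[Key, int]:
    """Run map, shuffle, and reduce in a single process."""
    emitted = (pair for event in events for pair in map_event(event))
    return dict(reduce_counts(k, v) for k, v in shuffle(emitted).items())


if __name__ == "__main__":
    sample = [
        ("u1", "http://example.com/fashion/shoes"),
        ("u1", "http://example.com/style/fall"),
        ("u2", "http://example.com/sports/scores"),
        ("u2", "http://example.com/about"),
    ]
    for key, count in sorted(run_job(sample).items()):
        print(key, count)
```

Because each map call looks at one event and each reduce call looks at one key, both steps can be spread across many machines, which is what makes the pattern a fit for billions of events per day.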

People just can’t process Internet-scale data and make it deliver real business value at a cost-effective price point without having Map/Reduce in their toolkit. With Map/Reduce, we’re able to deliver some pretty amazing services at scale.

  • URL and Audience Classification services across Internet-wide data sets
  • Intelligent audience expansion that maintains performance while extending reach
  • Dynamic marketing campaign optimization in-flight across 100s of millions of users

Hadoop just for data ingestion? Not if you actually know what you’re doing and don’t have legacy products to protect.

In the meantime, enjoy DJ Kool!


Influencer Conference

Albert with Ari Goldberg

On Wednesday, October 5th, Albert Azout (Sociocast’s CEO) joined a panel of speakers at the Influencer Conference, held in New York City. Alongside Albert were Ari Goldberg (CEO, Stylecaster), Sarah Conley (StyleITOline.com), Anthony Santagati (Ecommerce/digital @ Lacoste), and Dayanne Danier (Head Designer, Byen Abye). The moderator for the session was Simmone Oliver (NYT Fashion and Style Producer).

The topic was: “The complexion of the fashion industry has changed and continues to morph. The world of digital competition and game analytics has changed the playing field and emerging and established brands are both wrestling with the opportunities and challenges in this new world. What does the future bring to the fashion industry as technology and creative drive the business forward?” – influencercon.com

Albert is sitting all the way on the right!

Albert explained the importance of looking at how people behave, not at what they say they do. By using Big Data sets, we are able to build a highly accurate picture of consumers’ habits and how they actually behave. The information that marketers generally collect and use is based on what people say they like, which is, as Albert says, ‘not the most effective way of targeting consumers’. The vast majority of people will say they like a particular page and then never return to it. Ari Goldberg added an interesting analogy to further Albert’s point: ‘everyone in New York City claims they read the NYT, however what you actually see on the street is people reading the New York Post!’


The Power of Big Data

Big Data is here.

Today we have access to unprecedented amounts of data. Alfredo Gangotena, CMO of Mastercard, affirms this. He says:

“The mass of consumers will conduct their transactions in ways that are immediately captured by a processor—and that means generating data that will be available for analysis.”

What’s more, people have already been doing this for quite some time.

The true power of having such a massive amount of data lies in the ability to analyze it. From there we can begin to understand the behaviors of people in ways that were simply not feasible at any other time in history. It’s an exciting time for marketers. The ability to create precise models of niche groups through Big Data is opening up new methods of communicating with consumers that were not possible before. Coupled with increasingly available mobile data, this means companies can not only learn in real time but also interact with their customers in real time. Making marketers this relevant to their customers has never been more possible (or more important).

In short, it will not just be about how effectively companies learn to use these large swaths of data, but also about how fast they can process them.

The questions you should be asking yourself are: do I already have big data, and how do I unlock its power?


A Trillion Pages on the Internet (Understated: IMHO)

Scott Hoffman passed on a really interesting article from CNN to me this afternoon, titled ‘How Many pages are on the internet?’. I had never really considered that question before. After thinking about it for a moment, I realized I had no clue! Is it 1 billion? 100 billion? A trillion? 10 trillion? Well, the World Wide Web Foundation may have an answer for us in the rapidly approaching New Year. Regardless of whether they succeed, this article got me thinking about how much data is actually out there. I can only imagine how much data has been produced today alone. The sheer amount of data out there is just staggering.


Information

Sociocast discovers and delivers the most predictive REAL-TIME audience data in the marketplace.