Wednesday, November 17, 2010

Gravity introduces system that studies your social interests




We built Gravity to help the right information find you.

Today, we live in a world in which we’re constantly bombarded by information: 90 million tweets per day, 35 hours of video uploaded per minute, 1.6 million blog posts per day and an average of 130 friends demanding our attention. With so much information being created on a daily basis, it’s hard to find what you’re looking for and impossible to know what you’ve missed.

Our answer is the Interest Graph: an online representation of your real world interests and a new lens through which to view the internet. Your Interest Graph is your own personal electromagnet. It pulls the best stuff to you based on your interests and leaves all the noise at a safe distance where it can’t distract you. We build your Interest Graph by analyzing social data (like tweets, retweets, status updates, likes and shares) to create a holistic picture of who you are and what you’re interested in. When you click personalize on a site that uses Gravity, your electromagnet gets activated and the right information starts flowing your way. We connect you with content, people and advertising based on the probability that you’ll love it. It’s about helping the right information find you.

With our platform, any website will be able to tap into Gravity's Interest Graph and enable personalized experiences for their users.


Technology
Overview

If you want to geek out on the inner workings of Gravity, read on. If you’re looking for a simpler description of what Gravity is, head to About Gravity.

Your experience on the web should reflect who you are. We call this the personalized web and we’re developing technology to make it happen.

In theory, it’s a simple idea. In practice it’s challenging to implement, a pain in the neck to collect the right data, and near-impossible to do something useful and fun with the data… but that’s our plan. Why embark on such a challenging endeavor? We think personalizing the web will make user experiences more interesting and more fun.

Two big things are needed to deliver a personalized web experience. First, you need to index the web and tag everything. You need to know what websites, content, media, products, ads, etc. are out there and what topics and interests they cover. Second, you need to create an interest graph for each person. You need to understand what people are interested in and how interested they are. Combine a rich web index and an interest graph and you can do some pretty interesting things, like filter the web based on a person’s interests.
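
To make the "index plus interest graph" idea concrete, here is a minimal sketch in Python. It is an illustration only, not Gravity's implementation: the `Page` class, the `relevance` and `personalize` functions, the example URLs, the weights and the threshold are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """One entry in a toy web index (hypothetical structure)."""
    url: str
    topics: dict  # topic name -> how strongly the page covers it (0..1)


def relevance(page, interests):
    """Sum of (page topic weight * user interest level) over shared topics."""
    return sum(w * interests.get(t, 0.0) for t, w in page.topics.items())


def personalize(pages, interests, threshold=0.2):
    """Keep pages scoring at or above the threshold, best first."""
    scored = [(relevance(p, interests), p) for p in pages]
    kept = [(s, p) for s, p in scored if s >= threshold]
    kept.sort(key=lambda sp: sp[0], reverse=True)
    return [p.url for _, p in kept]


# Assumed sample data: a three-page index and one person's interest graph.
index = [
    Page("example.com/marathon-training", {"running": 0.9, "fitness": 0.6}),
    Page("example.com/celebrity-gossip", {"celebrities": 1.0}),
    Page("example.com/trail-shoes-review", {"running": 0.5, "shopping": 0.4}),
]
interests = {"running": 0.8, "fitness": 0.3}

print(personalize(index, interests))
```

In this sketch the gossip page shares no topic with the interest graph, so it scores zero and is filtered out, while the two running pages are kept and ranked.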

At the heart of Gravity is a semantic engine that extracts interests from any source of information. The engine is used to understand the web and the people interacting with it. It enables us to classify information with our web ontology, to build interest graphs and to deliver a variety of personalized web experiences. We’ve filed several patents related to our technology and its application to building interest graphs and enabling the personalized web.

Semantic Engine

The basic function of the semantic engine is to take a blob of text and figure out what it’s about, much like what your brain does when you read a newspaper article.

The first step is to analyze the blob. We use natural language processing (NLP) extensively. Instead of simply identifying keywords, our semantic engine analyzes a variety of linguistic and statistical factors. The output of our analysis is highly structured data that contains the key characteristics of the blob, sort of like a sample of the blob’s DNA.

The next step is to compare the blob’s DNA to a DNA database; our DNA database is our dynamic web ontology. The goal is to match the DNA of the blob to DNA signatures of topics in our ontology. Your standard blob of text (a sentence) will typically generate several matches. A paragraph or page of content may generate hundreds of topic matches.
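
The first two steps can be sketched roughly as follows. This is a deliberately simplified stand-in: the post does not say how the signatures are computed, so this example assumes a plain word-count vector as the "DNA" and cosine similarity as the matching rule. The names `signature`, `cosine`, `match_topics` and `TOPIC_SIGNATURES`, along with the three toy topics, are hypothetical.

```python
import math
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on", "with"}


def signature(text):
    """Toy 'DNA sample': counts of lowercase words, minus stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two word-count signatures."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Assumed miniature 'DNA database' with three topics.
TOPIC_SIGNATURES = {
    "Marathon": signature("marathon race running runners miles finish training"),
    "Espresso": signature("espresso coffee beans roast crema barista shot"),
    "Lego": signature("lego bricks build set minifigure toy"),
}


def match_topics(text, min_score=0.1):
    """Return (topic, score) pairs whose similarity clears min_score."""
    sig = signature(text)
    scores = {t: cosine(sig, tsig) for t, tsig in TOPIC_SIGNATURES.items()}
    hits = [(t, s) for t, s in scores.items() if s >= min_score]
    return sorted(hits, key=lambda ts: ts[1], reverse=True)


matches = match_topics("Finished my first marathon today after months of training!")
print(matches)
```

Here the sentence shares the words "marathon" and "training" with the Marathon signature and nothing with the other two, so only that topic is returned.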

Finally, we use a modified convergence algorithm to convert a list of topic matches to a handful of interests. Our algorithm prioritizes each interest based on three factors: the number of topics related to it, the average distance between those topics and the interest, and the strength of the relationship between each topic and the interest.
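
One way to picture this prioritization is the sketch below. The actual convergence algorithm is not described in this post, so the scoring formula (topic count times average strength, divided by average distance), the `converge` function and the sample `LINKS` data are all assumptions chosen only to combine the three factors named above.

```python
from collections import defaultdict

# Assumed sample data:
# (matched topic, candidate interest, graph distance in hops, strength 0..1)
LINKS = [
    ("Marathon", "Running", 1, 0.9),
    ("Trail shoes", "Running", 2, 0.6),
    ("Interval training", "Running", 1, 0.8),
    ("Interval training", "Fitness", 2, 0.5),
    ("Espresso", "Coffee", 1, 0.9),
]


def converge(links, top_n=2):
    """Rank candidate interests and keep the top_n.

    Hypothetical score: more supporting topics and stronger relationships
    raise it, a larger average distance lowers it. Distances must be >= 1.
    """
    grouped = defaultdict(list)
    for _topic, interest, distance, strength in links:
        grouped[interest].append((distance, strength))

    scores = {}
    for interest, edges in grouped.items():
        count = len(edges)
        avg_distance = sum(d for d, _ in edges) / count
        avg_strength = sum(s for _, s in edges) / count
        scores[interest] = count * avg_strength / avg_distance

    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_n]


print(converge(LINKS))
```

With this sample data, Running is supported by three nearby topics and ranks first, Coffee comes second, and Fitness (one weak, distant link) is dropped.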

That’s it. Like we said, the engine takes a blob of text and figures out what it’s about.

Web Ontology

The web ontology is our DNA database. It’s actually a data graph (not a database), which does an excellent job of representing relationships between data and lends itself to graph traversal algorithms. Our ontology has 7+ million topical nodes, further augmented by metadata we’ve created or sourced from other web services. Gravity’s semantic engine uses the dynamic web ontology to match the signature of unidentified text blobs to the signatures of known topics.
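
As a small illustration of why a graph suits traversal, the sketch below stores a tiny ontology as an adjacency map and uses a breadth-first search to measure how many hops separate a topic from broader ones. The node names, the `ONTOLOGY` dictionary and the `distances_from` function are assumptions for this example; nothing here reflects how Gravity actually stores its 7+ million nodes.

```python
from collections import deque

# Assumed toy ontology: each topic points to the broader topics it relates to.
ONTOLOGY = {
    "Marathon": ["Long-distance running"],
    "Long-distance running": ["Running"],
    "Running": ["Athletics", "Fitness"],
    "Athletics": ["Sports"],
    "Fitness": ["Health"],
    "Sports": [],
    "Health": [],
}


def distances_from(start, graph, max_hops=3):
    """Breadth-first search: hop count from start to each node within max_hops."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return seen


print(distances_from("Marathon", ONTOLOGY))
```

Hop counts like these are one plausible source for the "distance" between a topic and an interest.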

Our web ontology is built on top of Wikipedia, DBpedia, Yago and OpenCyc. These resources give our ontology its basic structure, including relationships between topics. Next, we mine the open web for topical content and related social data and map this content to the ontology — thus creating a web ontology. The web ontology is different from a normal ontology in that it’s augmented with linguistic, statistical and other metadata extracted from the web. This metadata provides the basis for the DNA signatures of topics in our database. We’ve crawled terabytes of data from major publishers, blogs, forums and other public sources of information to create the most accurate DNA signatures possible.

We built our web ontology to be adaptive over time. One of the most powerful aspects of open-source resources like Wikipedia is crowd-sourced human curation. We use a combination of human and algorithmic curation to enable our ontology to grow and adapt as the world evolves.

The Theory Behind the Interest Graph

The interest graph is an online representation of your real-world interests and a new lens through which you can play with the web. We want your interest graph to become your personal filter for the web and to help you to discover content, ideas, people, events, products and services that you’ll like.

Building an interest graph is an iterative process. Instead of just asking you what you’re interested in, we start by pulling in information about you from the web (this might include your tweets, status updates, things you’ve liked or shared, blog articles you’ve written, or information from your profile on social networks). We use our semantic engine and dynamic web ontology to comb through your information and to identify potential interests. Our convergence algorithms reduce all your potential interests to the most meaningful ones and enable us to establish preliminary interest intensity.
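
A rough sketch of that first pass might look like the following. It assumes the semantic engine has already turned each source into a list of potential interests, and it uses simple mention counts as a stand-in for the real convergence step. The function `initial_interest_graph`, the `extracted` sample data and the normalization to a 0..1 intensity are all hypothetical.

```python
from collections import Counter


def initial_interest_graph(extracted, keep=2):
    """Keep the most-mentioned interests and scale intensity to 0..1.

    `extracted` maps a source name to the potential interests found in it.
    """
    counts = Counter(
        interest for interests in extracted.values() for interest in interests
    )
    top = counts.most_common(keep)
    if not top:
        return {}
    strongest = top[0][1]
    return {interest: n / strongest for interest, n in top}


# Assumed output of the extraction step for one person.
extracted = {
    "tweets": ["Running", "Coffee", "Running"],
    "likes": ["Running", "Photography"],
    "blog_posts": ["Coffee", "Travel"],
}

print(initial_interest_graph(extracted))
```

In this example Running is mentioned three times and gets the top intensity, Coffee follows at two thirds of that, and the one-off mentions are dropped.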

Once we’ve created your initial interest graph, we give it a life of its own. Your interest graph changes dynamically over time as you interact with Gravity-powered services, change your presence on the web, or make changes to your interest graph directly. For example, we might add Legos to your interest graph because of several recent related tweets, but maybe you don’t really like Legos. You’ll always have an opportunity to remove interests or reduce your level of attachment.
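
The Lego example can be sketched as a tiny data structure. Everything in it is an assumption for illustration: the `InterestGraph` class, its `observe`, `reduce` and `remove` methods, the step sizes, and the choice to remember removed interests so they are not re-added automatically.

```python
class InterestGraph:
    """Toy interest graph: interest name -> intensity between 0 and 1."""

    def __init__(self):
        self.intensity = {}
        self.blocked = set()  # interests the person explicitly removed

    def observe(self, interest, weight=0.2):
        """Strengthen an interest after a related signal, such as a tweet."""
        if interest in self.blocked:
            return  # respect an earlier removal
        current = self.intensity.get(interest, 0.0)
        self.intensity[interest] = min(1.0, current + weight)

    def reduce(self, interest, factor=0.5):
        """Let the person lower their level of attachment."""
        if interest in self.intensity:
            self.intensity[interest] *= factor

    def remove(self, interest):
        """Let the person delete an interest and keep it from coming back."""
        self.intensity.pop(interest, None)
        self.blocked.add(interest)


graph = InterestGraph()
for _ in range(3):          # several recent Lego-related tweets
    graph.observe("Lego")
graph.observe("Running", 0.6)

graph.remove("Lego")        # "I don't really like Legos"
graph.observe("Lego")       # a later tweet no longer re-adds it
graph.reduce("Running")     # attachment lowered from 0.6 to 0.3

print(graph.intensity)
```

After these steps only Running remains, at half its earlier intensity.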
