Content-Based Recommender Systems I: Introduction

Basics of Content-based Recommender Systems

Systems implementing a content-based recommendation approach analyze a set of documents and/or descriptions of items previously rated by a user, and build a model or profile of user interests based on the features of the objects rated by that user.
The profile is a structured representation of user interests, used to recommend new interesting items. The recommendation process basically consists of matching the attributes of the user profile against the attributes of a content object.

A high-level architecture

Content Analyzer When information has no structure, some kind of pre-processing step is needed to extract structured, relevant information. The main responsibility of this component is to represent the content of items coming from information sources in a form suitable for the next processing steps.
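
As a minimal sketch of what such a pre-processing step might look like, the snippet below turns unstructured text into a bag-of-words feature dictionary. The function name `analyze` and the tiny stop-word list are assumptions made for illustration; a real content analyzer would typically add stemming, term weighting, or domain-specific extraction.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for illustration only.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "to"}

def analyze(text):
    """Turn raw text into a bag-of-words mapping: term -> occurrence count."""
    # Lowercase the text and keep only alphabetic tokens.
    tokens = re.findall(r"[a-z]+", text.lower())
    # Drop stop words, which carry little information about the item.
    return Counter(t for t in tokens if t not in STOP_WORDS)

features = analyze("The history of jazz and the blues in the South")
print(dict(features))
```

The resulting dictionary is the structured item representation that the later components can consume.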

Profile Learner This module collects data representative of the user preferences and tries to generalize this data in order to construct the user profile. Usually, the generalization strategy is realized through machine learning techniques, which are able to infer a model of user interests starting from items liked or disliked in the past.
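
One simple way to generalize from liked and disliked items is an additive, Rocchio-style update: features of liked items raise the corresponding profile weights, and features of disliked items lower them. This is only one possible strategy, chosen here for brevity and not prescribed by the text above; the function name `learn_profile` and the sample data are hypothetical.

```python
from collections import defaultdict

def learn_profile(rated_items):
    """Build a term -> weight profile from (features, liked) pairs.

    Assumption: a plain additive update with equal weight for positive
    and negative feedback; real systems usually tune these weights.
    """
    profile = defaultdict(float)
    for features, liked in rated_items:
        sign = 1.0 if liked else -1.0
        for term, weight in features.items():
            profile[term] += sign * weight
    return dict(profile)

# Hypothetical rating history: two liked items and one disliked item.
history = [
    ({"jazz": 1.0, "history": 1.0}, True),
    ({"jazz": 1.0, "blues": 1.0}, True),
    ({"pop": 1.0, "history": 1.0}, False),
]
profile = learn_profile(history)
print(profile)
```

In this sketch a term that appears equally often in liked and disliked items (here `history`) ends up with weight zero, so it neither helps nor hurts a candidate item.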

Filtering Component This module exploits the user profile to suggest relevant items by matching the profile representation against that of items to be recommended. The result is a binary or continuous relevance judgment, the latter case resulting in a ranked list of potentially interesting items.
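
The matching step can be sketched with cosine similarity between the profile vector and each item vector. Cosine similarity is a common choice but an assumption here, as are the names `cosine`, `rank`, and `is_relevant` and the threshold of 0.5. Sorting by score gives the continuous (ranked) judgment, while thresholding gives the binary one.

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector matches nothing
    return dot / (norm_a * norm_b)

def rank(profile, candidates):
    """Continuous judgment: (score, name) pairs, best match first."""
    scored = [(cosine(profile, feats), name) for name, feats in candidates.items()]
    return sorted(scored, reverse=True)

def is_relevant(profile, features, threshold=0.5):
    """Binary judgment: relevant if the similarity reaches the threshold."""
    return cosine(profile, features) >= threshold

# Hypothetical profile and candidate items.
profile = {"jazz": 2.0, "blues": 1.0}
candidates = {
    "Jazz Legends": {"jazz": 1.0},
    "Blues Roots": {"blues": 1.0},
    "Pop Hits": {"pop": 1.0},
}
ranking = rank(profile, candidates)
for score, name in ranking:
    print(f"{name}: {score:.3f}")
```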


User Independence Content-based recommenders exploit solely ratings provided by the active user to build their own profile. By contrast, collaborative filtering methods need ratings from other users in order to find the "nearest neighbors" of the active user, i.e., users that have similar tastes since they rated the same items similarly.

Transparency Explanations of how the recommender system works can be provided by explicitly listing the content features or descriptions that caused an item to occur in the list of recommendations. Conversely, collaborative systems are black boxes, since the only explanation for an item recommendation is that unknown users with similar tastes liked that item.
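
To illustrate how such an explanation could be produced, the sketch below lists the item features that contributed most to a match with the profile. It assumes a dictionary-based profile with signed weights and non-negative item feature weights; the function name `explain` and the sample data are hypothetical.

```python
def explain(profile, item_features, top_n=3):
    """Return the features that pushed the item into the recommendations.

    Only features with a positive profile weight are reported, ordered
    by their contribution (profile weight times item weight).
    """
    contributions = {
        term: profile[term] * weight
        for term, weight in item_features.items()
        if profile.get(term, 0.0) > 0
    }
    ordered = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    return ordered[:top_n]

# Hypothetical profile and recommended item.
profile = {"jazz": 2.0, "blues": 1.0, "pop": -1.0}
item = {"jazz": 1.0, "blues": 1.0, "documentary": 1.0}
reasons = explain(profile, item)
print("Recommended because of:", [term for term, _ in reasons])
```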

New Item Content-based recommenders are capable of recommending items not yet rated by any user. As a consequence, they do not suffer from the first-rater problem, which affects collaborative recommenders that rely solely on users' preferences to make recommendations.
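
The contrast can be sketched as follows. The content-based score depends only on the item's features, so it is available for a brand-new item, whereas a rating-based score has nothing to work with until someone rates the item. Both functions are deliberately naive stand-ins with hypothetical names, not actual recommender implementations.

```python
def content_score(profile, item_features):
    """Dot product of profile and item features; needs no ratings at all."""
    return sum(profile.get(term, 0.0) * w for term, w in item_features.items())

def collaborative_score(item_ratings):
    """Naive stand-in: mean rating given by other users.

    Returns None when nobody has rated the item yet (first-rater problem).
    """
    if not item_ratings:
        return None
    return sum(item_ratings) / len(item_ratings)

# A hypothetical new item: it has features but no ratings yet.
profile = {"jazz": 2.0, "blues": 1.0}
new_item_features = {"jazz": 1.0, "live": 1.0}
new_item_ratings = []

print(content_score(profile, new_item_features))
print(collaborative_score(new_item_ratings))
```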


Limited Content Analysis Content-based techniques have a natural limit in the number and type of features that are associated, whether automatically or manually, with the objects they recommend. Domain knowledge is often needed. No content-based recommendation system can provide suitable suggestions if the analyzed content does not contain enough information to discriminate items the user likes from items the user does not like.

Over-specialization Content-based recommenders have no inherent method for finding something unexpected. The system suggests items whose scores are high when matched against the user profile; hence the user is going to be recommended items similar to those already rated. This drawback is also called the serendipity problem, to highlight the tendency of content-based systems to produce recommendations with a limited degree of novelty.

New User Enough ratings have to be collected before a content-based recommender system can really understand user preferences and provide accurate recommendations.