org.grouplens.lenskit.knn.item.package-info

/*
 * LensKit, an open source recommender systems toolkit.
 * Copyright 2010-2014 LensKit Contributors.  See CONTRIBUTORS.md.
 * Work on LensKit has been funded by the National Science Foundation under
 * grants IIS 05-34939, 08-08692, 08-12148, and 10-17697.
 *
 * This program is free software; you can redistribute it and/or modify
 * it under the terms of the GNU Lesser General Public License as
 * published by the Free Software Foundation; either version 2.1 of the
 * License, or (at your option) any later version.
 *
 * This program is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
 * FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
 * details.
 *
 * You should have received a copy of the GNU General Public License along with
 * this program; if not, write to the Free Software Foundation, Inc., 51
 * Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA.
 */
/**
 * Implementation of item-item collaborative filtering.
 *
 * <p>The item-item CF implementation is built up of several pieces. The
 * {@linkplain org.grouplens.lenskit.knn.item.model.ItemItemModelBuilder model builder} takes the rating data
 * and several parameters and components, such as the
 * {@linkplain org.grouplens.lenskit.vectors.similarity.VectorSimilarity similarity function} and {@linkplain ModelSize model size},
 * and computes the {@linkplain org.grouplens.lenskit.knn.item.model.SimilarityMatrixModel similarity matrix}. The
 * {@linkplain ItemItemScorer scorer}
 * uses this model to score items.</p>
 *
 * <p>The basic idea of item-item CF is to compute similarities between items, typically
 * based on the users that have rated them, and then recommend items similar to the items
 * that a user likes. The model is then truncated (only the {@link ModelSize} most similar
 * items are retained for each item) to save space. Neighborhoods are further truncated
 * when doing recommendation; only the {@link org.grouplens.lenskit.knn.NeighborhoodSize} most similar items that
 * a user has rated are used to score any given item. {@link ModelSize} is typically
 * larger than {@link org.grouplens.lenskit.knn.NeighborhoodSize} to improve the ability of the recommender to find
 * neighbors.</p>
 *
 * <p>When the similarity function is asymmetric (\(s(i,j)=s(j,i)\) does not hold), some care
 * is needed to make sure that the function is used in the correct direction. Following
 * Deshpande and Karypis, we use the similarity function as \(s(j,i)\), where \(j\) is the
 * item the user has purchased or rated and \(i\) is the item that is going to be scored. This
 * value is then stored in row \(i\) and column \(j\) of the matrix. Rows are then truncated
 * (so we retain the {@link ModelSize} most similar items for each \(i\)); this direction differs
 * from Deshpande and Karypis, as row truncation is more efficient and simpler to write within
 * LensKit's item-item algorithm structure, and performs better in offline tests against the
 * MovieLens 1M data set (see writeup).
 * Computation against a particular item the user has rated is done down that item's column.</p>
 *
 * <p>The scorers and recommenders actually operate on a generic {@link org.grouplens.lenskit.knn.item.model.ItemItemModel}, so the
 * item-based scoring algorithm can be used against other sources of similarity, such as
 * similarities stored in a database or text index.</p>
 */
package org.grouplens.lenskit.knn.item;
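
The two truncation steps described in the package documentation above (model size at build time, neighborhood size at scoring time) can be illustrated with a small standalone sketch. It does not use the LensKit API: `ItemItemSketch`, `Neighbor`, `buildModel`, and `score` are hypothetical names, and the similarity-weighted average used for scoring is an assumption made for illustration, not something the documentation above specifies.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of the LensKit API. */
final class ItemItemSketch {
    /** One (neighbor item, similarity) pair. */
    static final class Neighbor {
        final long item;
        final double similarity;

        Neighbor(long item, double similarity) {
            this.item = item;
            this.similarity = similarity;
        }
    }

    /** Returns a new list holding the n most similar neighbors, best first. */
    static List<Neighbor> truncate(List<Neighbor> neighbors, int n) {
        List<Neighbor> sorted = new ArrayList<>(neighbors);
        sorted.sort(Comparator.comparingDouble((Neighbor nb) -> nb.similarity).reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    /** Build step: keep only the modelSize most similar items for each item. */
    static Map<Long, List<Neighbor>> buildModel(Map<Long, List<Neighbor>> similarities, int modelSize) {
        Map<Long, List<Neighbor>> model = new HashMap<>();
        for (Map.Entry<Long, List<Neighbor>> e : similarities.entrySet()) {
            model.put(e.getKey(), truncate(e.getValue(), modelSize));
        }
        return model;
    }

    /**
     * Score step: of the target's stored neighbors, keep those the user has rated,
     * truncate to neighborhoodSize, and take a similarity-weighted average (assumed).
     */
    static double score(Map<Long, List<Neighbor>> model, Map<Long, Double> userRatings,
                        long target, int neighborhoodSize) {
        List<Neighbor> rated = new ArrayList<>();
        for (Neighbor nb : model.getOrDefault(target, Collections.<Neighbor>emptyList())) {
            if (userRatings.containsKey(nb.item)) {
                rated.add(nb);
            }
        }
        double weighted = 0.0;
        double weights = 0.0;
        for (Neighbor nb : truncate(rated, neighborhoodSize)) {
            weighted += nb.similarity * userRatings.get(nb.item);
            weights += Math.abs(nb.similarity);
        }
        return weights == 0.0 ? Double.NaN : weighted / weights;
    }
}
```

Because the scorer can only use stored neighbors that the user has also rated, a model size larger than the neighborhood size leaves more candidates to choose from at scoring time.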

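The direction convention for asymmetric similarities can be sketched in the same spirit. The example below is an assumption-laden illustration, not LensKit code: `AsymmetricSimilaritySketch`, `Similarity`, `build`, `truncateRow`, and `accumulate` are hypothetical names, items are indexed 0..n-1 in a dense array, similarities are assumed non-negative, and a stored value of zero is taken to mean "not retained".

```java
import java.util.Arrays;

/** Hypothetical illustration only; not part of the LensKit API. */
final class AsymmetricSimilaritySketch {
    /** s(j, i): similarity from rated item j to the item i that is to be scored. */
    interface Similarity {
        double apply(int rated, int target);
    }

    /** Stores s(j, i) in row i, column j, then truncates each row to modelSize entries. */
    static double[][] build(int nItems, Similarity s, int modelSize) {
        double[][] m = new double[nItems][nItems];
        for (int i = 0; i < nItems; i++) {
            for (int j = 0; j < nItems; j++) {
                if (i != j) {
                    m[i][j] = s.apply(j, i);
                }
            }
            truncateRow(m[i], modelSize);
        }
        return m;
    }

    /** Zeroes every entry of the row except the `keep` largest ones. */
    static void truncateRow(double[] row, int keep) {
        Integer[] idx = new Integer[row.length];
        for (int k = 0; k < idx.length; k++) {
            idx[k] = k;
        }
        // Sort column indices by similarity, largest first.
        Arrays.sort(idx, (a, b) -> Double.compare(row[b], row[a]));
        for (int k = keep; k < idx.length; k++) {
            row[idx[k]] = 0.0;
        }
    }

    /** Walks down column j, adding the contribution of rated item j to each candidate i. */
    static void accumulate(double[][] m, int j, double rating, double[] weighted, double[] weights) {
        for (int i = 0; i < m.length; i++) {
            double sim = m[i][j];
            if (sim != 0.0) {
                weighted[i] += sim * rating;
                weights[i] += Math.abs(sim);
            }
        }
    }
}
```

Calling `accumulate` once per rated item and then dividing `weighted[i]` by `weights[i]` would give a weighted-average score for each candidate `i`; that final step is likewise an assumption about the scoring rule.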

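The last point, that scoring only needs a generic source of item neighbors, can be illustrated as follows. This is a sketch under stated assumptions: `NeighborSource`, `InMemoryNeighborSource`, and `GenericScorerSketch` are hypothetical names chosen for this example and are not the `ItemItemModel` interface itself, and the weighted-average scoring rule is again assumed.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a generic item-item model; not the LensKit interface. */
interface NeighborSource {
    /** Returns neighbor item id -> similarity for the given item; empty when nothing is known. */
    Map<Long, Double> neighborsOf(long item);
}

/** Similarities held in memory, for example the output of a model builder. */
final class InMemoryNeighborSource implements NeighborSource {
    private final Map<Long, Map<Long, Double>> rows = new HashMap<>();

    void put(long item, long neighbor, double similarity) {
        rows.computeIfAbsent(item, k -> new HashMap<>()).put(neighbor, similarity);
    }

    @Override
    public Map<Long, Double> neighborsOf(long item) {
        return rows.getOrDefault(item, Collections.emptyMap());
    }
}

final class GenericScorerSketch {
    /** Similarity-weighted average of the user's ratings over whatever neighbors the source returns. */
    static double score(NeighborSource source, Map<Long, Double> userRatings, long target) {
        double weighted = 0.0;
        double weights = 0.0;
        for (Map.Entry<Long, Double> e : source.neighborsOf(target).entrySet()) {
            Double rating = userRatings.get(e.getKey());
            if (rating != null) {
                weighted += e.getValue() * rating;
                weights += Math.abs(e.getValue());
            }
        }
        return weights == 0.0 ? Double.NaN : weighted / weights;
    }
}
```

Since `NeighborSource` has a single method, a database or text-index lookup could be plugged in as a lambda or a small adapter class without touching the scoring code.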


