Cold start (computing)

  1. Systems affected

     New community  New item  New user 

  2. Mitigation strategies

     Profile completion  Feature mapping  Hybrid feature weighting 

  3. See also

  4. References

  5. External links

For the process of restarting a computer without performing any shut-down procedure, see hard reboot. For other uses, see Cold start (disambiguation).

Cold start is a potential problem in computer-based information systems which involve a degree of automated data modelling. Specifically, it concerns the issue that the system cannot draw any inferences for users or items about which it has not yet gathered sufficient information.

Systems affected

The cold start problem is a well known and well researched problem for recommender systems. Recommender systems form a specific type of information filtering (IF) technique that attempts to present information items (e-commerce, movies, music, books, news, images, web pages) that are likely of interest to the user. Typically, a recommender system compares the user's profile to some reference characteristics. These characteristics may be related to item characteristics (content-based filtering) or the user's social environment and past behavior (collaborative filtering).

Depending on the system, the user can be associated with various kinds of interactions: ratings, bookmarks, purchases, likes, number of page visits, etc.

There are three cases of cold start:[1]

  1. New community: refers to the start-up of the recommender, when, although a catalogue of items might exist, almost no users are present and the lack of user interaction makes it very hard to provide reliable recommendations.
  2. New item: a new item is added to the system. It might have some content information, but no interactions are present.
  3. New user: a new user registers and has not provided any interaction yet, so it is not possible to provide personalized recommendations.

New community

The new community problem, or systemic bootstrapping, refers to the startup of the system, when virtually no information the recommender can rely upon is present.[2]

This case presents the disadvantages of both the New user and the New item case, as all items and users are new.

Because of this, some of the techniques developed to deal with those two cases are not applicable to system bootstrapping.

New item

The item cold-start problem arises when items added to the catalogue have no interactions or very few. This is a problem mainly for collaborative filtering algorithms, because they rely on the item's interactions to make recommendations. If no interactions are available, a pure collaborative algorithm cannot recommend the item. If only a few interactions are available, a collaborative algorithm will be able to recommend it, but the quality of those recommendations will be poor.[2]

This gives rise to another issue, one related not to new items but to unpopular items.
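The following minimal sketch illustrates why a pure collaborative algorithm cannot recommend an item without interactions. It assumes a simple item-item scheme based on cosine similarity over binary interactions; the data and the names `interactions`, `catalogue`, and `score` are hypothetical and chosen only for illustration.

```python
from math import sqrt

# Hypothetical implicit-feedback data: user -> set of items interacted with.
interactions = {
    "alice": {"item_a", "item_b"},
    "bob": {"item_a", "item_b", "item_c"},
    "carol": {"item_b", "item_c"},
}
# item_new has just been added to the catalogue and has no interactions.
catalogue = ["item_a", "item_b", "item_c", "item_new"]


def users_of(item):
    """Return the set of users who interacted with the item."""
    return {u for u, items in interactions.items() if item in items}


def cosine(item_x, item_y):
    """Cosine similarity between the binary interaction vectors of two items."""
    ux, uy = users_of(item_x), users_of(item_y)
    if not ux or not uy:
        return 0.0  # an item without interactions is similar to nothing
    return len(ux & uy) / sqrt(len(ux) * len(uy))


def score(user, candidate):
    """Item-item collaborative score: sum of similarities to the user's items."""
    return sum(cosine(candidate, seen) for seen in interactions[user])


# Score every item alice has not interacted with yet.
alice_scores = {
    item: score("alice", item)
    for item in catalogue
    if item not in interactions["alice"]
}
```

In this sketch `item_new` always receives a score of zero, whatever the user, so it can never be ranked above an item that has at least one relevant interaction.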

In some cases (e.g. movie recommendations) it might happen that a handful of items receive an extremely high number of interactions, while most of the items only receive a fraction of them. This is referred to as popularity bias.[3]

In the context of cold-start items, popularity bias is important because many items, even if they have been in the catalogue for months, may have received only a few interactions. This creates a negative loop in which unpopular items are poorly recommended, therefore receive much less visibility than popular ones, and struggle to receive interactions.[4] While it is expected that some items will be less popular than others, this issue specifically refers to the fact that the recommender does not have enough collaborative information to recommend them in a meaningful and reliable way.[5]

Content-based filtering algorithms, on the other hand, are in theory much less prone to the new item problem. Since content-based recommenders choose which items to recommend based on the features the items possess, a new item's features still allow a recommendation to be made even if no interactions exist for it.[6]

This of course assumes that a new item is already described by its attributes, which is not always the case. Consider so-called editorial features (e.g. director, cast, title, year): these are always known when the item, in this case a movie, is added to the catalogue. Other kinds of attributes, such as features extracted from user reviews and tags, might not be.[7] Content-based algorithms relying on user-provided features suffer from the cold-start item problem as well, since for new items with no (or very few) interactions, no (or very few) user reviews and tags will be available either.
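As an illustration of the content-based case, the sketch below scores a movie that has no interactions by comparing its editorial features with those of a movie the user liked. Jaccard similarity is an assumption made for simplicity, and the feature values and names (`features`, `liked`, `content_score`) are hypothetical.

```python
# Hypothetical editorial features, known when a movie enters the catalogue.
features = {
    "movie_1": {"director:x", "actor:p", "genre:spy"},
    "movie_2": {"director:y", "actor:q", "genre:drama"},
    "movie_new": {"director:x", "actor:r", "genre:spy"},  # no interactions yet
}
liked = {"movie_1"}  # items the user interacted with positively


def jaccard(a, b):
    """Jaccard similarity between two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def content_score(candidate):
    """Average feature similarity between the candidate and the liked items."""
    total = sum(jaccard(features[candidate], features[i]) for i in liked)
    return total / len(liked)


# Score every movie the user has not interacted with.
scores = {item: content_score(item) for item in features if item not in liked}
```

Here `movie_new` obtains a non-zero score purely from its features, which is what makes the approach usable for cold items as long as those features are available at insertion time.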

New user

The new user case arises when a new user enrolls in the system and, for a certain period of time, the recommender has to provide recommendations without relying on the user's past interactions, since none have occurred yet.[8]

This problem is of particular importance when the recommender is part of the service offered to users, since a user who is faced with recommendations of poor quality might decide to stop using the system before providing enough interactions to allow the recommender to understand their interests.

The main strategy in dealing with new users is to ask them to provide some preferences in order to build an initial user profile. A balance has to be found between the length of the user registration process, which if too long might induce too many users to abandon it, and the amount of initial data required for the recommender to work properly.[9]

Similarly to the new items case, not all recommender algorithms are affected in the same way.

Item-item recommenders are affected because they rely on the user profile to weight how relevant other users' preferences are. Collaborative filtering algorithms are the most affected, since without interactions no inference can be made about the user's preferences.

User-user recommender algorithms[10] behave slightly differently. A user-user content-based algorithm relies on the user's features (e.g. age, gender, country) to find similar users and recommend the items they interacted with in a positive way, and is therefore robust to the new user case. Note that all this information is acquired during the registration process, either by asking the user to enter the data or by leveraging data already available, e.g. in the user's social media accounts.[11]
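A user-user content-based recommender of this kind might be sketched as follows. The similarity function (same country plus closeness in age) and all data are assumptions made for illustration; a real system would use whatever registration features it collects.

```python
# Hypothetical existing users with registration features and liked items.
users = {
    "u1": {"age": 25, "country": "IT", "liked": {"a", "b"}},
    "u2": {"age": 27, "country": "IT", "liked": {"b", "c"}},
    "u3": {"age": 60, "country": "US", "liked": {"d"}},
}


def similarity(p, q):
    """Feature similarity: same country plus closeness in age (assumed form)."""
    same_country = 1.0 if p["country"] == q["country"] else 0.0
    age_closeness = 1.0 / (1.0 + abs(p["age"] - q["age"]))
    return same_country + age_closeness


def recommend(new_user, k=2):
    """Rank items liked by the k existing users most similar to the new user."""
    neighbours = sorted(
        users.values(), key=lambda u: similarity(new_user, u), reverse=True
    )[:k]
    scores = {}
    for neighbour in neighbours:
        weight = similarity(new_user, neighbour)
        for item in neighbour["liked"]:
            scores[item] = scores.get(item, 0.0) + weight
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


# A newly registered user: features are known, interactions are not.
recs = recommend({"age": 26, "country": "IT"})
```

The new user has no interactions at all, yet receives a ranked list because the neighbours are found from registration features alone.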

Mitigation strategies

Due to the high number of recommender algorithms available as well as system type and characteristics, many strategies to mitigate the cold-start problem have been developed. The main approach is to rely on hybrid recommenders, in order to mitigate the disadvantages of one category or model by combining it with another.[12] [13] [14]

All three categories of cold start (new community, new item, and new user) have in common the lack of user interactions, and the strategies available to address them share some commonalities.

A common strategy when dealing with new items is to couple a collaborative filtering recommender, for warm items, with a content-based filtering recommender, for cold items. While the two algorithms can be combined in different ways, the main drawback of this method is the poor recommendation quality often exhibited by content-based recommenders in scenarios where it is difficult to provide a comprehensive description of the item characteristics.[16]

In the case of new users, if no demographic features are present or their quality is too poor, a common strategy is to offer non-personalized recommendations. This means that the user could simply be recommended the most popular items, either globally or for their specific geographical region or language.
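One of the simplest ways to combine the two recommenders is a switching hybrid, sketched below under stated assumptions: the scores of the two underlying models are taken as given, and the threshold `MIN_INTERACTIONS` that separates warm from cold items is a hypothetical value.

```python
MIN_INTERACTIONS = 5  # hypothetical threshold separating warm from cold items

# Hypothetical inputs: interaction counts and the outputs of two trained models.
interaction_counts = {"warm_item": 120, "cold_item": 0}
collaborative_scores = {"warm_item": 0.8}  # no score exists for the cold item
content_scores = {"warm_item": 0.4, "cold_item": 0.6}


def hybrid_score(item):
    """Use the collaborative score for warm items, the content score otherwise."""
    is_warm = interaction_counts.get(item, 0) >= MIN_INTERACTIONS
    if is_warm and item in collaborative_scores:
        return collaborative_scores[item]
    return content_scores.get(item, 0.0)


# Rank the whole catalogue with a single scoring function.
ranking = sorted(interaction_counts, key=hybrid_score, reverse=True)
```

Switching is only one possible combination. Note that the scores of the two models are not necessarily on the same scale, so a practical system would typically need to calibrate them before ranking warm and cold items together.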
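A non-personalized fallback of this kind can be sketched as a popularity count, optionally restricted to the user's region. The interaction log and the function name `most_popular` are hypothetical.

```python
from collections import Counter

# Hypothetical interaction log: (region of the user, item).
log = [
    ("IT", "a"), ("IT", "a"), ("IT", "b"),
    ("US", "c"), ("US", "c"), ("US", "c"), ("US", "a"),
]


def most_popular(region=None, n=2):
    """Return the n most popular items in a region, or globally as a fallback."""
    counts = Counter(item for r, item in log if r == region)
    if not counts:  # no region given, or no data for it: use global popularity
        counts = Counter(item for _, item in log)
    # Most interactions first; ties broken alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:n]]
```

For example, `most_popular("IT")` returns the items most interacted with by Italian users, while `most_popular()` or a region with no data falls back to the global ranking.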

Profile completion

One of the available options when dealing with cold users or items is to rapidly acquire some preference data. There are various ways to do that depending on the amount of information required. These techniques are called preference elicitation strategies.[15][16]

This may be done either explicitly (by querying the user) or implicitly (by observing the user's behaviour). In both cases, the cold start problem would imply that the user has to dedicate an amount of effort using the system in its 'dumb' state – contributing to the construction of their user profile – before the system can start providing any intelligent recommendations. [17]

For example, MovieLens, a web-based recommender system for movies, asks the user to rate some movies as part of the registration.

While preference elicitation strategies are a simple and effective way to deal with new users, the additional requirements during registration make the process more time-consuming for the user. Moreover, the quality of the obtained preferences might not be ideal, as the user could rate items seen months or years ago, or the provided ratings could be almost random if the user supplied them without paying attention, just to complete the registration quickly.

The construction of the user's profile may also be automated by integrating information from other user activities, such as browsing histories or social media platforms. If, for example, a user has been reading information about a particular music artist from a media portal, then the associated recommender system would automatically propose that artist's releases when the user visits the music store.[18]

A variation of the previous approach is to automatically assign ratings to new items, based on the ratings assigned by the community to other similar items. Item similarity would be determined according to the items' content-based characteristics.[17]
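The idea of assigning an initial rating to a new item from the ratings of similar items can be sketched as a similarity-weighted average. Jaccard similarity over content features is an assumption, and the feature sets and community ratings below are hypothetical.

```python
# Hypothetical content features and community mean ratings of existing items.
features = {
    "old_1": {"rock", "1990s"},
    "old_2": {"rock", "2000s"},
    "old_3": {"jazz", "1960s"},
    "new": {"rock", "1990s", "live"},  # new item without any rating
}
mean_rating = {"old_1": 4.0, "old_2": 3.0, "old_3": 5.0}


def jaccard(a, b):
    """Jaccard similarity between two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def bootstrap_rating(item):
    """Similarity-weighted average of the ratings of already rated items."""
    weights = {o: jaccard(features[item], features[o]) for o in mean_rating}
    total = sum(weights.values())
    if total == 0:
        return None  # no similar rated item: nothing can be inferred
    return sum(w * mean_rating[o] for o, w in weights.items()) / total


estimated = bootstrap_rating("new")
```

The estimate lies between the ratings of the two rock items and is closer to that of `old_1`, which shares more features with the new item; the jazz item contributes nothing because it has no features in common.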

It is also possible to create an initial profile of a user based on the user's personality characteristics and to use that profile to generate personalized recommendations.[19][20]

Personality characteristics of the user can be identified using a personality model such as the five factor model (FFM).

Another possible technique is to apply active learning. The main goal of active learning is to guide the user through the preference elicitation process, asking them to rate only the items that, from the recommender's point of view, will be the most informative. This is done by analysing the available data and estimating the usefulness of the data points (e.g., ratings, interactions).[21]

As an example, say that we want to build two clusters from a certain cloud of points. As soon as we have identified two points, each belonging to a different cluster, which is the next most informative point? If we take a point close to one we already know, we can expect that it will likely belong to the same cluster. If we choose a point that lies between the two clusters, knowing which cluster it belongs to will help us find where the boundary is, allowing us to classify many other points with just a few observations.
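The clustering example can be sketched in one dimension. The selection rule used here, picking the point whose distances to the two known clusters are most similar, is one simple form of uncertainty sampling and is an assumption of this sketch; the points are hypothetical.

```python
# Two points with known cluster membership, and candidates still unlabelled.
labelled = {0.0: "A", 10.0: "B"}
unlabelled = [1.0, 4.5, 9.0]


def distance_to(cluster, x):
    """Distance from x to the nearest labelled point of the given cluster."""
    return min(abs(x - p) for p, c in labelled.items() if c == cluster)


def uncertainty(x):
    """Higher when x is about equally far from both clusters."""
    return -abs(distance_to("A", x) - distance_to("B", x))


# The most informative point to ask about next.
next_query = max(unlabelled, key=uncertainty)
```

The points at 1.0 and 9.0 are close to already known points and their cluster is easy to guess, so the point at 4.5, which lies near the presumed boundary, is selected.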

The cold start problem is also exhibited by interface agents. Since such an agent typically learns the user's preferences implicitly by observing patterns in the user's behaviour – "watching over the shoulder" – it would take time before the agent may perform any adaptations personalised to the user. Even then, its assistance would be limited to activities which it has formerly observed the user engaging in.[22]

The cold start problem may be overcome by introducing an element of collaboration amongst agents assisting various users. This way, novel situations may be handled by requesting other agents to share what they have already learnt from their respective users.[22]

Feature mapping

In recent years more advanced strategies have been proposed. They all rely on machine learning and attempt to merge content and collaborative information in a single model.

One example of these approaches is attribute-to-feature mapping,[23] which is tailored to matrix factorization algorithms.[24] The basic idea is the following. A matrix factorization model represents the user-item interactions as the product of two rectangular matrices whose content is learned from the known interactions via machine learning. Each user is associated with a row of the first matrix and each item with a column of the second matrix. The row or column associated with a specific user or item holds its latent factors.[25] When a new item is added, it has no associated latent factors, and the lack of interactions means they cannot be learned as they were for other items. If each item is associated with some features (e.g. author, year, publisher, actors), it is possible to define an embedding function which, given the item features, estimates the corresponding item latent factors. The embedding function can be designed in many ways and is trained with the data already available from warm items. The same applies to a new user: if some information is available for them (e.g. age, nationality, gender), their latent factors can be estimated via an embedding function.
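The sketch below shows one very simple embedding function, given here only as an illustration and not as the method of the cited work: each attribute is mapped to the average latent factors of the warm items that carry it, and a new item is embedded as the average over its attributes. The latent factors are hypothetical values standing in for the output of a trained matrix factorization model.

```python
# Latent factors already learned for warm items and one user
# (hypothetical values; in practice they come from matrix factorization).
item_factors = {
    "book_1": [1.0, 0.0],
    "book_2": [0.8, 0.2],
    "book_3": [0.0, 1.0],
}
item_attributes = {
    "book_1": {"author:a", "publisher:p"},
    "book_2": {"author:a", "publisher:q"},
    "book_3": {"author:b", "publisher:q"},
}
user_factors = {"alice": [1.0, 0.0]}


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]


def fit_attribute_factors():
    """For each attribute, average the factors of the warm items that have it."""
    by_attribute = {}
    for item, attributes in item_attributes.items():
        for attribute in attributes:
            by_attribute.setdefault(attribute, []).append(item_factors[item])
    return {a: mean(vectors) for a, vectors in by_attribute.items()}


def embed(attributes, attribute_factors):
    """Embedding function: estimate latent factors from an item's attributes."""
    known = [attribute_factors[a] for a in attributes if a in attribute_factors]
    return mean(known) if known else None  # None if no attribute was seen before


attribute_factors = fit_attribute_factors()
new_item_factors = embed({"author:a", "publisher:p"}, attribute_factors)
# Predicted preference: dot product of user and estimated item factors.
predicted = sum(u * v for u, v in zip(user_factors["alice"], new_item_factors))
```

Once the new item has estimated latent factors, it can be scored with the same dot product as any warm item. More expressive embedding functions, such as a regression model trained on warm items, fit the same interface.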

Hybrid feature weighting

Another recent approach, which bears similarities to feature mapping, is to build a hybrid content-based filtering recommender in which features, either of the items or of the users, are weighted according to the user's perception of importance. When identifying a movie that the user might like, different attributes (e.g. actors, director, country, title) will have different importance. As an example, consider the James Bond movie series: the main actor changed many times over the years, while some other cast members, like Lois Maxwell, did not. Therefore, her presence will probably be a better identifier of that kind of movie than the presence of one of the various main actors.[26][27]

Although various techniques exist to apply feature weighting to user or item features in recommender systems, most of them come from the information retrieval domain, like tf–idf and Okapi BM25, and only a few have been developed specifically for recommenders.[28]

Hybrid feature weighting techniques in particular are tailored to the recommender system domain. Some of them learn feature weights by directly exploiting the user's interactions with items, like FBSM.[27] Others rely on an intermediate collaborative model trained on warm items and attempt to learn the content feature weights that best approximate the collaborative model.[26][29][30]
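The second family of techniques can be illustrated with a deliberately small sketch: feature weights are fitted by gradient descent so that a weighted content similarity approximates the similarity given by a collaborative model on warm items. The squared-error objective, the learning rate, and the data are assumptions of this sketch and do not reproduce any of the cited methods.

```python
# Each pair of warm items: (per-feature overlap, collaborative similarity).
# Feature 0: shared supporting actor; feature 1: shared lead actor.
# All values are hypothetical.
pairs = [
    ([1.0, 1.0], 0.9),
    ([1.0, 0.0], 0.8),
    ([0.0, 1.0], 0.1),
]


def learn_weights(pairs, steps=500, learning_rate=0.1):
    """Fit feature weights by gradient descent on the squared error."""
    weights = [0.0] * len(pairs[0][0])
    for _ in range(steps):
        gradient = [0.0] * len(weights)
        for overlap, target in pairs:
            # Weighted content similarity minus the collaborative similarity.
            error = sum(w * x for w, x in zip(weights, overlap)) - target
            for f, x in enumerate(overlap):
                gradient[f] += 2.0 * error * x
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights


weights = learn_weights(pairs)
```

With this data the supporting-actor feature ends up with a much larger weight than the lead-actor feature, mirroring the James Bond example: the learned weights can then be used to compute content similarities for cold items that the collaborative model cannot handle.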

Many of the hybrid methods can be considered special cases of factorization machines. [31] [32]

See also

  • Collaborative filtering
  • Preference elicitation
  • Recommender system
  • Active learning (machine learning)
  • Five Factor Model

References

1. Bobadilla, Jesús; Ortega, Fernando; Hernando, Antonio; Bernal, Jesús (February 2012). "A collaborative filtering approach to mitigate the new user cold start problem". Knowledge-Based Systems. 26: 225–238. doi:10.1016/j.knosys.2011.07.021. http://oa.upm.es/15302/
2. Lika, Blerina; Kolomvatsos, Kostas; Hadjiefthymiades, Stathes (March 2014). "Facing the cold start problem in recommender systems". Expert Systems with Applications. 41 (4): 2065–2073. doi:10.1016/j.eswa.2013.09.005.
3. Hou, Lei; Pan, Xue; Liu, Kecheng (7 March 2018). "Balancing the popularity bias of object similarities for personalised recommendation". The European Physical Journal B. 91 (3). doi:10.1140/epjb/e2018-80374-8.
4. Abdollahpouri, Himan; Burke, Robin; Mobasher, Bamshad (27 August 2017). Proceedings of the Eleventh ACM Conference on Recommender Systems - Rec Sys '17. ACM. pp. 42–46. doi:10.1145/3109859.3109912. ISBN 9781450346528.
5. Park, Yoon-Joo; Tuzhilin, Alexander (23 October 2008). Proceedings of the 2008 ACM conference on Recommender systems - Rec Sys '08. ACM. pp. 11–18. doi:10.1145/1454008.1454012. ISBN 9781605580937. CiteSeerX 10.1.1.421.1833.
6. Pazzani, Michael J.; Billsus, Daniel (2007). "Content-Based Recommendation Systems". The Adaptive Web. Lecture Notes in Computer Science. 4321. pp. 325–341. doi:10.1007/978-3-540-72079-9_10. ISBN 978-3-540-72078-2. CiteSeerX 10.1.1.130.8327.
7. Chen, Li; Chen, Guanliang; Wang, Feng (22 January 2015). "Recommender systems based on user reviews: the state of the art". User Modeling and User-Adapted Interaction. 25 (2): 99–154. doi:10.1007/s11257-015-9155-5.
8. Bobadilla, Jesús; Ortega, Fernando; Hernando, Antonio; Bernal, Jesús (February 2012). "A collaborative filtering approach to mitigate the new user cold start problem". Knowledge-Based Systems. 26: 225–238. doi:10.1016/j.knosys.2011.07.021. http://oa.upm.es/15302/
9. Rashid, Al Mamunur; Karypis, George; Riedl, John (20 December 2008). "Learning preferences of new users in recommender systems". ACM SIGKDD Explorations Newsletter. 10 (2): 90. doi:10.1145/1540276.1540302.
10. Bobadilla, J.; Ortega, F.; Hernando, A.; Gutiérrez, A. (July 2013). "Recommender systems survey". Knowledge-Based Systems. 46: 109–132. doi:10.1016/j.knosys.2013.03.012.
11. Zhang, Zi-Ke; Liu, Chuang; Zhang, Yi-Cheng; Zhou, Tao (1 October 2010). "Solving the cold-start problem in recommender systems with social tags". EPL (Europhysics Letters). 92 (2): 28002. doi:10.1209/0295-5075/92/28002.
12. Huang, Zan; Chen, Hsinchun; Zeng, Daniel (1 January 2004). "Applying associative retrieval techniques to alleviate the sparsity problem in collaborative filtering". ACM Transactions on Information Systems. 22 (1): 116–142. doi:10.1145/963770.963775. CiteSeerX 10.1.1.3.1590.
13. Salter, J.; Antonopoulos, N. (January 2006). "CinemaScreen Recommender Agent: Combining Collaborative and Content-Based Filtering". IEEE Intelligent Systems. 21 (1): 35–41. doi:10.1109/MIS.2006.4. http://epubs.surrey.ac.uk/1833/1/fulltext.pdf
14. Burke, Robin (2007). "Hybrid Web Recommender Systems". The Adaptive Web. Lecture Notes in Computer Science. 4321. pp. 377–408. doi:10.1007/978-3-540-72079-9_12. ISBN 978-3-540-72078-2. CiteSeerX 10.1.1.395.8975.
15. Elahi, Mehdi; Ricci, Francesco; Rubens, Neil (2014). E-Commerce and Web Technologies. Lecture Notes in Business Information Processing. 188. Springer International Publishing. pp. 113–124. doi:10.1007/978-3-319-10491-1_12. ISBN 978-3-319-10491-1.
16. Elahi, Mehdi; Ricci, Francesco; Rubens, Neil (2016). "A survey of active learning in collaborative filtering recommender systems". Computer Science Review. 20: 29–50. doi:10.1016/j.cosrev.2016.05.002. https://www.researchgate.net/publication/303781992
17. Schein, Andrew I.; Popescul, Alexandrin; Ungar, Lyle H.; Pennock, David M. (2002). "Methods and Metrics for Cold-Start Recommendations". Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2002). New York City, New York: ACM. pp. 253–260. ISBN 1-58113-561-0. http://citeseer.ist.psu.edu/schein02methods.html Retrieved 2008-02-02.
18. "Vendor attempts to crack 'cold start' problem in content recommendations". Mobile Media: 18. 2007-06-29. http://www.xiam.com/press/MM062907.pdf Archived at https://web.archive.org/web/20080801000000/http://www.xiam.com/press/MM062907.pdf on 2008-08-01. Retrieved 2008-02-02.
19. Tkalcic, Marko; Chen, Li (2016). "Personality and Recommender Systems". In Ricci, Francesco; Rokach, Lior; Shapira, Bracha (eds.). Recommender Systems Handbook (2nd ed.). Springer US. doi:10.1007/978-1-4899-7637-6_21. ISBN 978-1-4899-7637-6. https://rd.springer.com/chapter/10.1007/978-1-4899-7637-6_21
20. Fernández-Tobías, Ignacio; Braunhofer, Matthias; Elahi, Mehdi; Ricci, Francesco; Cantador, Iván (2016). "Alleviating the new user problem in collaborative filtering by exploiting personality information". User Modeling and User-Adapted Interaction. 26 (2–3): 221–255. doi:10.1007/s11257-016-9172-z. hdl:10486/674370.
21. Rubens, Neil; Elahi, Mehdi; Sugiyama, Masashi; Kaplan, Dain (2016). "Active Learning in Recommender Systems". In Ricci, Francesco; Rokach, Lior; Shapira, Bracha (eds.). Recommender Systems Handbook (2nd ed.). Springer US. doi:10.1007/978-1-4899-7637-6_24. ISBN 978-1-4899-7637-6. https://rd.springer.com/chapter/10.1007/978-1-4899-7637-6_24
22. Lashkari, Yezdi; Metral, Max; Maes, Pattie (1994). "Collaborative Interface Agents". Proceedings of the Twelfth National Conference on Artificial Intelligence. Seattle, Washington: AAAI Press. pp. 444–449. ISBN 0-262-61102-3. http://citeseer.ist.psu.edu/lashkari94collaborative.html Retrieved 2008-02-02.
23. Gantner, Zeno; Drumond, Lucas; Freudenthaler, Cristoph (20 January 2011). 2010 IEEE International Conference on Data Mining. pp. 176–185. doi:10.1109/ICDM.2010.129. ISBN 978-1-4244-9131-5. CiteSeerX 10.1.1.187.5933.
24. Koren, Yehuda; Bell, Robert; Volinsky, Chris (August 2009). "Matrix Factorization Techniques for Recommender Systems". Computer. 42 (8): 30–37. doi:10.1109/MC.2009.263. CiteSeerX 10.1.1.147.8295.
25. Agarwal, Deepak; Chen, Bee-Chung (28 June 2009). Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '09. ACM. pp. 19–28. doi:10.1145/1557019.1557029. ISBN 9781605584959.
26. Cella, Leonardo; Cereda, Stefano; Quadrana, Massimo; Cremonesi, Paolo (2017). "Deriving Item Features Relevance from Past User Interactions". UMAP '17 Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization. pp. 275–279. doi:10.1145/3079628.3079695. hdl:11311/1061220. ISBN 9781450346351.
27. Sharma, Mohit; Zhou, Jiayu; Hu, Junling; Karypis, George (2015). "Feature-based factorized Bilinear Similarity Model for Cold-Start Top-n Item Recommendation". Proceedings of the 2015 SIAM International Conference on Data Mining. pp. 190–198. doi:10.1137/1.9781611974010.22. ISBN 978-1-61197-401-0.
28. Symeonidis, Panagiotis; Nanopoulos, Alexandros; Manolopoulos, Yannis (25 July 2007). "Feature-Weighted User Model for Recommender Systems". User Modeling 2007. Lecture Notes in Computer Science. 4511. pp. 97–106. doi:10.1007/978-3-540-73078-1_13. ISBN 978-3-540-73077-4.
29. Ferrari Dacrema, Maurizio; Gasparin, Alberto; Cremonesi, Paolo. "Deriving item features relevance from collaborative domain knowledge". Proceedings of Knowledge-aware and Conversational Recommender Systems (KaRS) Workshop 2018 (co-located with RecSys 2018). http://ceur-ws.org/Vol-2290/kars2018_paper1.pdf
30. Bernardis, Cesare; Ferrari Dacrema, Maurizio; Cremonesi, Paolo (2018). "A novel graph-based model for hybrid recommendations in cold-start scenarios". Proceedings of the Late-Breaking Results Track Part of the Twelfth ACM Conference on Recommender Systems. arXiv:1808.10664.
31. Rendle, Steffen (1 May 2012). "Factorization Machines with libFM". ACM Transactions on Intelligent Systems and Technology. 3 (3): 1–22. doi:10.1145/2168752.2168771.
32. Rendle, Steffen (2010). "Factorization Machines". 2010 IEEE International Conference on Data Mining. IEEE. pp. 995–1000. doi:10.1109/ICDM.2010.127. ISBN 9781424491315. CiteSeerX 10.1.1.393.8529.

External links

  • http://activeintelligence.org/wp-content/papercite-data/pdf/Rubens-Active-Learning-RecSysHB2010.pdf
  • http://activeintelligence.org/research/al-rs/

