Recommender Systems


Netflix Prize
Recommendation and Collaborative Filtering
KNN-based Methods
Matrix Factorization
Recent Recommenders
Case Study

Netflix Prize

RMSE(Root Mean Square Error)

$\sqrt{\frac{1}{\left|R \right|}\sum_{(i,x)\in R}(\hat{r}_{xi}-r_{xi})^2}$

The challenge: cut Netflix's original RMSE of 0.9514 by a further 10%!
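As a rough illustration of the metric, the sketch below computes RMSE over a handful of known ratings. The toy ratings and the names `rmse`, `actual`, and `predicted` are made up for this example; only the 0.9514 baseline comes from the notes above.

```python
import math


def rmse(predicted, actual):
    """Root mean square error over the known ratings.

    Both arguments map a (user, item) pair to a rating; only the pairs
    present in `actual` are scored.
    """
    squared_errors = [(predicted[key] - r) ** 2 for key, r in actual.items()]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toy ratings, not Netflix data.
actual = {("x1", "i1"): 4.0, ("x1", "i2"): 2.0, ("x2", "i1"): 5.0}
predicted = {("x1", "i1"): 3.0, ("x1", "i2"): 2.0, ("x2", "i1"): 5.0}

score = rmse(predicted, actual)

# A 10% reduction of the baseline quoted above.
target = 0.9514 * 0.9
```

Here `target` is the RMSE a submission would have to reach or beat under the 10% rule.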

Recommendation and Collaborative Filtering

Background

types of user-item matrix data

Explicit feedback : ratings, likes, and so on
Implicit feedback : clicks, purchases, bookmarks, and so on

Evaluating Recommendation Method

Rating prediction : RMSE
Classification : Precision, Recall, F-measure (F1 score)
- whether the top-N recommendation list contains the true items
Ranking : NDCG, MRR
- how many of the true items make it into the top-N, and how high they are ranked
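The classification and ranking metrics above can be sketched as follows for a single user with binary relevance. The function names and the toy list are assumptions for illustration; MRR is the mean of `reciprocal_rank` over all users, and the NDCG variant shown uses the common `1 / log2(position + 1)` discount.

```python
import math


def precision_recall_at_n(ranked, relevant, n):
    """Precision@N and Recall@N for one user's ranked recommendation list."""
    hits = sum(1 for item in ranked[:n] if item in relevant)
    return hits / n, hits / len(relevant)


def f1(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def reciprocal_rank(ranked, relevant):
    """1 / position of the first relevant item, or 0 if none appears."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def ndcg_at_n(ranked, relevant, n):
    """NDCG@N with binary relevance."""
    dcg = sum(
        1.0 / math.log2(position + 1)
        for position, item in enumerate(ranked[:n], start=1)
        if item in relevant
    )
    ideal_hits = min(len(relevant), n)
    idcg = sum(1.0 / math.log2(position + 1) for position in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical recommendation list and ground-truth items for one user.
ranked = ["a", "b", "c", "d", "e"]
relevant = {"b", "d", "z"}

precision, recall = precision_recall_at_n(ranked, relevant, n=3)
f_score = f1(precision, recall)
rr = reciprocal_rank(ranked, relevant)
ndcg = ndcg_at_n(ranked, relevant, n=3)
```

Precision and recall only count hits inside the top-N, while the reciprocal rank and NDCG also reward placing the true items near the top.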

Content-based Approach

Recommends content that is similar to what is in the user's history.

Similarity : Cosine Similarity, Pearson Correlation

$sim(u, i)=\cos(\overrightarrow{w_u}, \overrightarrow{w_i}) = \frac{\overrightarrow{w_u}\cdot \overrightarrow{w_i}}{\left\| \overrightarrow{w_u}\right\|_2\times \left\| \overrightarrow{w_i}\right\|_2}$

The same cosine measure applied to the rating vectors of two users $x$ and $y$, where $S_{xy}$ is the set of items both have rated:

$sim(x, y) = \frac{\sum_{s\in S_{xy}}r_{x,s}r_{y, s}}{\sqrt{\sum_{s\in S_{xy}}r_{x,s}^2} \sqrt{\sum_{s\in S_{xy}}r_{y,s}^{2}}}$
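A minimal sketch of content-based scoring with cosine similarity, assuming the user profile and each item are already expressed as weight vectors over the same features. The feature axes, the weights, and the item names are invented for this example.

```python
import math


def cosine_similarity(w_u, w_i):
    """Cosine of the angle between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(w_u, w_i))
    norm_u = math.sqrt(sum(a * a for a in w_u))
    norm_i = math.sqrt(sum(b * b for b in w_i))
    if norm_u == 0 or norm_i == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_u * norm_i)


# Hypothetical feature weights over (action, comedy, romance).
user_profile = [0.9, 0.1, 0.0]
items = {
    "movie_a": [1.0, 0.0, 0.0],
    "movie_b": [0.0, 1.0, 1.0],
}

# Rank items by how closely they match the user's profile.
ranked = sorted(
    items,
    key=lambda name: cosine_similarity(user_profile, items[name]),
    reverse=True,
)
```

Because cosine similarity ignores vector length, a user with many ratings and a user with few can still be compared on the direction of their preferences.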

Collaborative Filtering(CF), KNN-based Method

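It does not need to be adapted separately for each domain, since it uses only the user-item ratings rather than item content.

The idea is to group together people with similar tastes.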

Pearson Correlation Coefficient

$sim(x, y)=\frac{\sum_{s\in S_{xy}}(r_{x,s}-\bar{r}_x)(r_{y, s}-\bar{r}_y)}{\sqrt{\sum_{s\in S_{xy}}(r_{x,s}-\bar{r}_x)^2}\sqrt{\sum_{s\in S_{xy}}(r_{y, s}-\bar{r}_y)^2}}$
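The Pearson formula can be sketched as below. This assumes each user's ratings are stored as a dict from item to rating, and that the mean $\bar{r}_x$ is taken over all items the user has rated (taking it over the co-rated items only is another common choice). The names and toy ratings are hypothetical.

```python
import math


def pearson_similarity(ratings_x, ratings_y):
    """Pearson correlation over the items both users have rated (S_xy)."""
    common = set(ratings_x) & set(ratings_y)
    if not common:
        return 0.0

    # Assumption: each mean is taken over all of that user's ratings.
    mean_x = sum(ratings_x.values()) / len(ratings_x)
    mean_y = sum(ratings_y.values()) / len(ratings_y)

    numerator = sum(
        (ratings_x[s] - mean_x) * (ratings_y[s] - mean_y) for s in common
    )
    denom_x = math.sqrt(sum((ratings_x[s] - mean_x) ** 2 for s in common))
    denom_y = math.sqrt(sum((ratings_y[s] - mean_y) ** 2 for s in common))
    if denom_x == 0 or denom_y == 0:
        return 0.0  # a user with constant ratings carries no signal
    return numerator / (denom_x * denom_y)


# Hypothetical toy ratings.
x = {"a": 5, "b": 3, "c": 1}
y = {"a": 4, "b": 3, "c": 2, "d": 3}  # same taste as x, milder scale
z = {"a": 1, "b": 3, "c": 5}          # opposite taste to x

sim_xy = pearson_similarity(x, y)
sim_xz = pearson_similarity(x, z)
```

Subtracting each user's mean is what lets `x` and `y` come out as highly similar even though they use the rating scale differently.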

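Pick the top-n most similar users.
Predict the rating.
- Simply take the average of their ratings.
- Take similarity into account: multiply each neighbor's rating by its sim, then divide by the sum of the sims.
- To avoid being skewed by people who mostly give 4s versus people who mostly give 2s, predict from each neighbor's deviation from their own average rating.

Variation : besides user-based CF, there is also item-based CF.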
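The three prediction rules in the list above can be sketched as follows. This is a minimal illustration that assumes the similarities to the target user are already computed and positive; `ratings`, `sims`, and all function names are hypothetical.

```python
def user_mean(ratings, user):
    """Average of all ratings given by `user`."""
    return sum(ratings[user].values()) / len(ratings[user])


def top_k_neighbors(sims, ratings, item, k):
    """The k most similar users (positive similarity) who rated `item`."""
    candidates = [u for u in sims if sims[u] > 0 and item in ratings[u]]
    return sorted(candidates, key=lambda u: sims[u], reverse=True)[:k]


def predict_mean(neighbors, ratings, item):
    """Rule 1: plain average of the neighbors' ratings."""
    return sum(ratings[u][item] for u in neighbors) / len(neighbors)


def predict_weighted(neighbors, ratings, item, sims):
    """Rule 2: similarity-weighted average."""
    total_sim = sum(sims[u] for u in neighbors)
    return sum(sims[u] * ratings[u][item] for u in neighbors) / total_sim


def predict_mean_centered(target, neighbors, ratings, item, sims):
    """Rule 3: target's own mean plus the weighted neighbor deviations."""
    total_sim = sum(sims[u] for u in neighbors)
    offset = sum(
        sims[u] * (ratings[u][item] - user_mean(ratings, u)) for u in neighbors
    ) / total_sim
    return user_mean(ratings, target) + offset


# Hypothetical toy data: predict "me"'s rating for movie "m3".
ratings = {
    "me": {"m1": 4, "m2": 2},
    "u1": {"m1": 5, "m2": 3, "m3": 4},  # generous rater, mean 4
    "u2": {"m1": 2, "m2": 1, "m3": 3},  # harsh rater, mean 2
    "u3": {"m1": 1, "m2": 5},           # has not rated m3
}
sims = {"u1": 0.8, "u2": 0.4, "u3": 0.9}  # assumed precomputed similarities

neighbors = top_k_neighbors(sims, ratings, "m3", k=2)
plain = predict_mean(neighbors, ratings, "m3")
weighted = predict_weighted(neighbors, ratings, "m3", sims)
centered = predict_mean_centered("me", neighbors, ratings, "m3", sims)
```

In the toy data, `u2` rates `m3` above their own average, so the mean-centered rule nudges the prediction above the target user's mean even though `u2`'s raw rating is low.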

Matrix Factorization

Latent Factor Models

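This material also came up in the deep learning class. The professor did say from the start of the course that this data science class covers somewhat dated techniques,

so this is how the two connect!

The model is trained to reduce the MSE in order to find latent features, and the empty cells of the matrix are then filled in with its predictions.

As with any model, overfitting becomes a problem.

Regularization is needed.

Adding bias terms and time effects on top of this is what finally reached the goal.

The equations in the last part were out of scope and were skimmed over in class, so I skip them here.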
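A minimal sketch of the idea, assuming plain stochastic gradient descent on the squared error with an L2 penalty. The notes do not say which optimizer or hyperparameters the lecture used, so `k`, `lr`, `reg`, `epochs`, and the toy ratings are all illustrative choices.

```python
import random


def predict(P, Q, x, i):
    """Predicted rating: dot product of user x's and item i's latent factors."""
    return sum(p * q for p, q in zip(P[x], Q[i]))


def mse(ratings, P, Q):
    """Mean squared error over the observed (user, item, rating) triples."""
    return sum((r - predict(P, Q, x, i)) ** 2 for x, i, r in ratings) / len(ratings)


def init_factors(n_users, n_items, k, seed=0):
    """Small random latent factors for every user and item."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    return P, Q


def train(ratings, P, Q, lr=0.01, reg=0.05, epochs=500):
    """SGD on squared error plus an L2 penalty (the regularization term)."""
    k = len(P[0])
    for _ in range(epochs):
        for x, i, r in ratings:
            err = r - predict(P, Q, x, i)
            for f in range(k):
                p, q = P[x][f], Q[i][f]
                # Step along the error gradient; `reg` shrinks the factors.
                P[x][f] += lr * (err * q - reg * p)
                Q[i][f] += lr * (err * p - reg * q)
    return P, Q


# Hypothetical observed entries of a 3x3 user-item matrix: (user, item, rating).
observed = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 1), (2, 2, 5)]

P, Q = init_factors(n_users=3, n_items=3, k=2)
mse_before = mse(observed, P, Q)
train(observed, P, Q)
mse_after = mse(observed, P, Q)

# Fill in an empty cell: user 0 has no rating for item 2.
filled = predict(P, Q, 0, 2)
```

Raising `reg` trades training error for smaller factors, which is the usual lever against overfitting on a sparse rating matrix.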
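For reference, a commonly used formulation of these steps is written out below. It is the standard textbook form, not necessarily the exact notation of the lecture; $p_x$ and $q_i$ are the latent vectors of user $x$ and item $i$, $\mu$ is the global mean rating, $b_x$ and $b_i$ are the user and item biases, and $\lambda$ is the regularization weight.

Regularized objective:

$\min_{P,Q}\sum_{(x,i)\in R}(r_{xi}-q_i\cdot p_x)^2+\lambda\left(\sum_{x}\left\| p_x\right\|^2+\sum_{i}\left\| q_i\right\|^2\right)$

Prediction with biases:

$\hat{r}_{xi}=\mu+b_x+b_i+q_i\cdot p_x$

Prediction with time-dependent terms, where $t$ is the time of the rating:

$\hat{r}_{xi}(t)=\mu+b_x(t)+b_i(t)+q_i\cdot p_x(t)$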


Attribution, NonCommercial, NoDerivatives
