Building a Product Recommendation Service with AWS :: 김태현 :: AWS Summit Seoul 2018
- 1. © 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved.
김태현
Solutions Architect / Amazon Web Services
Building a Product Recommendation Service with AWS
- 2.
What is recommendation?
Recommendation algorithms
Similarity algorithms
Recommendation system architecture
Performance evaluation
- 3.
About Me
- 5.
What is recommendation?
- 6.
Amazon - product recommendations
- 7.
Netflix - movie recommendations
- 8.
Why do we need a recommendation service?
- 9.
• 35% of Amazon's total revenue comes from recommendations
• 75% of Netflix users choose movies through recommendations
https://www.mckinsey.com/industries/retail/our-insights/how-retailers-can-keep-up-with-consumers
- 10.
A closer look at Amazon
- 11.
Amazon - Home
- 12.
Amazon - Product
- 13.
Amazon - Product
- 14.
Amazon - Product
- 15.
Recommendation algorithms
- 16.
Recommendation algorithms
• CF (Collaborative Filtering)
  • User-based
  • Item-based
• CBF (Content-Based Filtering)
  • Text
  • Image
• AR (Association Rule)
- 17.
CF (Collaborative Filtering)
- 18.
User-based filtering
- 19.
Item-based filtering
- 20.
Item-based filtering
- 21.
How to implement CF
[Figure: two rows of item IDs, 1 2 3 4 and 2 4 5]
- 22.
How to implement CF
[Figure: matrix over items 1 to 5 with 0/1 entries; exact layout not recoverable from the transcript]
- 23.
How to implement CF
[Figure: items 1 to 5; diagram not recoverable from the transcript]
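One way to read the three slides above is as a binary user-item matrix from which item-to-item similarity is computed. The sketch below is a minimal illustration under that assumption, not the speaker's implementation. The `views` sample data, the function names, and the choice of cosine similarity over co-occurrence counts are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical view log: user -> set of item IDs the user interacted with.
views = {
    "user1": {1, 2, 3, 4},
    "user2": {2, 4, 5},
    "user3": {1, 2},
}

def item_similarities(user_items):
    """Cosine similarity between items over binary user-item vectors."""
    item_count = defaultdict(int)   # number of users per item
    pair_count = defaultdict(int)   # number of users per item pair
    for items in user_items.values():
        for item in items:
            item_count[item] += 1
        for a, b in combinations(sorted(items), 2):
            pair_count[(a, b)] += 1
    sims = defaultdict(dict)
    for (a, b), co in pair_count.items():
        score = co / sqrt(item_count[a] * item_count[b])
        sims[a][b] = score
        sims[b][a] = score
    return sims

def recommend(user, user_items, sims, top_n=3):
    """Score unseen items by their summed similarity to the user's items."""
    seen = user_items[user]
    scores = defaultdict(float)
    for item in seen:
        for other, score in sims.get(item, {}).items():
            if other not in seen:
                scores[other] += score
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

sims = item_similarities(views)
print(recommend("user3", views, sims))  # [4, 3, 5]
```

Every pair of items inside every user's history is visited, so the cost grows quickly with history length and catalog size. That is the scalability concern the next slides raise.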
- 24.
The scalability problem
?
- 25.
Pre-Clustering
- 26.
MinHash
[Figure: Hash Function producing the value e883ba0a24d01f]
- 27.
MinHash
[Figure: Hash Function 1 and Hash Function 2, shown with the bit rows 1 0 0 1 1 and 0 1 1 1 0]
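The MinHash slides can be illustrated with a small sketch: each of several hash functions is applied to every member of a set, and only the minimum value per function is kept. This is a generic MinHash illustration, not the code behind the talk. The seeded MD5 helper, the signature length, and the `cluster_key` bucketing used for pre-clustering are assumptions made for the example.

```python
import hashlib

def _hash(value, seed):
    """Deterministic 64-bit hash of (seed, value); stands in for one hash function."""
    digest = hashlib.md5(f"{seed}:{value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")

def minhash_signature(items, num_hashes=64):
    """For each hash function, keep the minimum hash over the set's members."""
    return [min(_hash(item, seed) for item in items) for seed in range(num_hashes)]

def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions where the two minimums agree."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)

def cluster_key(signature, num_keys=2):
    """Join the first few minimums into a bucket key (a hypothetical pre-clustering step)."""
    return "-".join(format(value, "x") for value in signature[:num_keys])

user_a = {"item1", "item2", "item3", "item4"}
user_b = {"item1", "item2", "item3", "item5"}
sig_a = minhash_signature(user_a)
sig_b = minhash_signature(user_b)
print(estimate_jaccard(sig_a, sig_b))            # rough estimate; the exact Jaccard here is 3/5
print(cluster_key(sig_a) == cluster_key(sig_b))  # True only if both users land in the same bucket
```

Users whose keys match fall into the same cluster, so exact similarity only needs to be computed inside each cluster instead of across all users.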
- 28.
Pre-Clustering
- 29.
Pre-Clustering
[Figure: chart with axes Calculation Time and Cluster Size]
- 30.
Drawbacks of CF
• Cold start
  • Cannot recommend new products
  • Cannot recommend products that users have not viewed
• Solution
  • CBF (Content-Based Filtering)
- 31.
CBF (Content-Based Filtering)
- 32.
CBF (Content-Based Filtering)
• Content
  • Text
  • Image
- 33.
CBF – Word2Vec for Text
- 34.
CBF – Word2Vec for Text
- 35.
CBF – Deep Learning for Image
- 36.
CBF – Deep Learning for Image
- 37.
Hybrid
- 38.
Hybrid
• CF (Collaborative Filtering)
  • Effectiveness, coverage
• CBF (Content-Based Filtering)
  • Coverage
• CF + CBF
  • CF is the main algorithm
  • When CF returns too few results, fill the gap with CBF
  • For certain categories such as fashion or furniture, CBF can work well
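The "CF first, CBF as backfill" rule above can be sketched in a few lines. The function name, the `top_n` parameter, and the sample item IDs are hypothetical. Both input lists are assumed to be already ranked by their respective algorithms.

```python
def hybrid_recommend(cf_items, cbf_items, top_n=5):
    """Use CF results first, then fill any remaining slots with CBF results."""
    result = list(dict.fromkeys(cf_items))[:top_n]   # drop duplicates, keep CF order
    for item in cbf_items:
        if len(result) >= top_n:
            break
        if item not in result:
            result.append(item)
    return result

print(hybrid_recommend(["a", "b"], ["b", "c", "d", "e", "f"], top_n=4))  # ['a', 'b', 'c', 'd']
```

For a brand-new product or user, `cf_items` is simply empty and the result comes entirely from CBF, which is how the hybrid covers the cold-start case.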
- 39.
AR (Association Rule)
- 40.
AR (Association Rule)
- 41.
Drawbacks of AR
• Coverage
  • Because it only considers products that were actually purchased together
• Solution
  • AR (accuracy) + CF (coverage)
  • Applying CF to purchase logs can give an effect similar to AR.
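Association rules are commonly scored with support, confidence, and lift. The sketch below computes those three measures for item pairs only. It is a generic illustration rather than the talk's method, and the `baskets` data and the `min_support` threshold are made up for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets (one set of item IDs per order).
baskets = [
    {"camera", "sd_card"},
    {"camera", "sd_card", "tripod"},
    {"camera", "tripod"},
    {"sd_card"},
]

def pair_rules(baskets, min_support=0.25):
    """Return rules (lhs, rhs, support, confidence, lift) for item pairs."""
    n = len(baskets)
    item_count = Counter(item for basket in baskets for item in basket)
    pair_count = Counter(
        pair for basket in baskets for pair in combinations(sorted(basket), 2)
    )
    rules = []
    for (a, b), co in pair_count.items():
        support = co / n                      # share of baskets containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = co / item_count[lhs]           # P(rhs | lhs)
            lift = confidence / (item_count[rhs] / n)   # confidence relative to P(rhs)
            rules.append((lhs, rhs, support, confidence, lift))
    return sorted(rules, key=lambda r: (-r[3], r[0], r[1]))

for lhs, rhs, support, confidence, lift in pair_rules(baskets):
    print(f"{lhs} -> {rhs}: support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

Only pairs that actually co-occur in a basket ever produce a rule, which is the coverage limitation noted above.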
- 42.
Frequently bought together
- 43.
Similarity algorithms
- 44.
Similarity algorithms
• Jaccard
• Cosine
• Etc.
  • Euclidean
  • Manhattan
  • Pearson
  • Tanimoto
- 45.
Jaccard
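Jaccard similarity is the size of the intersection of two sets divided by the size of their union. A minimal sketch follows. Returning 0.0 for two empty sets is a convention chosen for this example.

```python
def jaccard(a, b):
    """|A ∩ B| / |A ∪ B| for two sets; defined here as 0.0 when both are empty."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

print(jaccard({1, 2, 3, 4}, {2, 4, 5}))  # 2 shared items out of 5 distinct -> 0.4
```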
- 46.
Cosine
[Figure: vectors A and B plotted on X, Y, Z axes, labeled Cosine]
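Cosine similarity is the dot product of two vectors divided by the product of their lengths. A minimal sketch follows. Returning 0.0 for an all-zero vector is a convention chosen for this example, and the sample vectors are illustrative.

```python
from math import sqrt

def cosine_similarity(a, b):
    """dot(a, b) / (|a| * |b|); returns 0.0 if either vector is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# One shared position out of three set bits in each vector: about 0.33.
print(cosine_similarity([1, 0, 0, 1, 1], [0, 1, 1, 1, 0]))
```

Unlike Jaccard, cosine similarity also works on weighted vectors such as view counts or ratings.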
- 47.
Recommendation system architecture
- 48.
Recommendation system architecture
[Diagram components: EMR, S3, DynamoDB, ElastiCache, Glue, Lambda, API Gateway, Kinesis Firehose]
최근 본 상품
- 50. © 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved.
추천 시스템 아키텍쳐
ElastiCache
LambdaAPI
Gateway
- 51. © 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved.
최근 본 상품
Key Items Score
User1
Item1 20180101000000
item2 20180201000000
item3 20180301000000
User2
item4 20180101000000
item5 20180201000000
item6 20180301000000
TTL 과 Max Item 관리 가능
Redis Sorted Set
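The table above maps naturally onto one sorted set per user, with the view timestamp as the score. The sketch below shows how the TTL and the item cap might be enforced. It uses the redis-py method names `zadd`, `zremrangebyrank`, `expire`, and `zrevrange`. The key naming scheme, the default limits, and the `InMemorySortedSets` stand-in (included only so the example runs without a Redis endpoint) are assumptions, not part of the talk.

```python
import time

def record_view(client, user_id, item_id, max_items=50,
                ttl_seconds=30 * 24 * 3600, now=None):
    """Add an item to the user's recently-viewed sorted set, scored by view time."""
    key = f"recent:{user_id}"                           # hypothetical key naming scheme
    score = time.time() if now is None else now
    client.zadd(key, {item_id: score})                  # insert, or refresh the score
    client.zremrangebyrank(key, 0, -(max_items + 1))    # keep only the newest max_items
    client.expire(key, ttl_seconds)                     # sliding TTL on the whole key

def recent_items(client, user_id, count=10):
    """Return the most recently viewed items, newest first."""
    return client.zrevrange(f"recent:{user_id}", 0, count - 1)

class InMemorySortedSets:
    """Tiny stand-in mimicking the four calls used above (TTL is recorded, not enforced)."""
    def __init__(self):
        self.data, self.ttl = {}, {}
    def zadd(self, key, mapping):
        self.data.setdefault(key, {}).update(mapping)
    def _ranked(self, key):
        return sorted(self.data.get(key, {}).items(), key=lambda kv: (kv[1], kv[0]))
    def zremrangebyrank(self, key, start, stop):
        ranked = self._ranked(key)
        if stop < 0:
            stop += len(ranked)
        if stop < start:
            return                                      # nothing falls in the range
        for member, _ in ranked[start:stop + 1]:
            del self.data[key][member]
    def expire(self, key, seconds):
        self.ttl[key] = seconds
    def zrevrange(self, key, start, end):
        return [m for m, _ in reversed(self._ranked(key))][start:end + 1]

client = InMemorySortedSets()   # with redis-py this would be a redis.Redis(...) client
for offset, item in enumerate(["item1", "item2", "item3", "item4"]):
    record_view(client, "User1", item, max_items=3, now=20180101000000 + offset)
print(recent_items(client, "User1"))  # ['item4', 'item3', 'item2']
```

With a real redis-py client, `zrevrange` returns bytes unless the client is created with `decode_responses=True`.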
- 52.
Data lake
- 53.
Data lake
[Diagram components: S3, Glue, Lambda, API Gateway, Kinesis Firehose]
- 54.
Data lake
• Store and analyze all data in a single central repository
• A data lake starts with storing data in S3
• The Glue Data Catalog provides a single view of the data
• Tips for improving data lake performance
  • Consolidate small files (512 MB to 1 GB)
  • Use columnar formats (Parquet, ORC)
  • Compress (Snappy)
  • Partition
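Two of the tips above, partitioning and small-file consolidation, can be sketched without any AWS calls. The first helper builds a Hive-style partitioned object key. The second plans which small files to merge into batches near a target size. The prefix, file names, and the greedy grouping strategy are assumptions for illustration. Actually writing Parquet with Snappy would need a library such as pyarrow, which is not shown here.

```python
from datetime import datetime, timezone

def partitioned_key(prefix, event_time, filename):
    """Build a key of the form prefix/year=YYYY/month=MM/day=DD/filename."""
    return (
        f"{prefix.rstrip('/')}/"
        f"year={event_time:%Y}/month={event_time:%m}/day={event_time:%d}/"
        f"{filename}"
    )

def plan_compaction(file_sizes, target_bytes=512 * 1024 * 1024):
    """Greedily group files (name -> size in bytes) into batches of about target_bytes."""
    batches, current, current_size = [], [], 0
    for name, size in sorted(file_sizes.items()):
        if current and current_size + size > target_bytes:
            batches.append(current)          # close the batch before it overflows
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches

key = partitioned_key(
    "logs/clicks", datetime(2018, 1, 1, tzinfo=timezone.utc), "part-0000.snappy.parquet"
)
print(key)  # logs/clicks/year=2018/month=01/day=01/part-0000.snappy.parquet

mb = 1024 * 1024
print(plan_compaction({"a": 300 * mb, "b": 300 * mb, "c": 100 * mb}))  # [['a'], ['b', 'c']]
```

Keys laid out as `name=value` folders let query engines that understand Hive-style partitions skip whole folders when a query filters on those columns.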
- 55.
Product recommendation
- 56.
Recommendation system architecture
[Diagram components: EMR, S3, DynamoDB, Glue, Lambda, API Gateway, Kinesis Firehose]
- 57.
Making product recommendation easier
- 58.
Recommendation system architecture
[Diagram components: S3, Glue, Lambda, API Gateway, Kinesis Firehose, SageMaker]
- 59.
SageMaker
Amazon SageMaker
A fully managed platform service that lets data scientists and developers build machine learning models quickly and easily
- 60.
SageMaker
Amazon SageMaker
1. Notebook Instances
2. Algorithms
3. ML Training Service
4. ML Hosting Service
- 61.
SageMaker
Problem | Algorithm | Learning Type
Discrete Classification, Regression | Linear Learner | Supervised
Discrete Classification, Regression | XGBoost Algorithm | Supervised
Discrete Recommendations | Factorization Machines | Supervised
Image Classification | Image Classification Algorithm | Supervised, CNN
Neural Machine Translation | Sequence to Sequence | Supervised, seq2seq
Time-series Prediction | DeepAR | Supervised, RNN
Discrete Groupings | K-Means Algorithm | Unsupervised
Dimensionality Reduction | PCA (Principal Component Analysis) | Unsupervised
Topic Determination | Latent Dirichlet Allocation (LDA) | Unsupervised
Topic Determination | Neural Topic Model (NTM) | Unsupervised, Neural Network Based
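Factorization Machines is the entry in the table aimed at recommendations. The sketch below shows how a second-order factorization machine scores one feature vector in its general textbook form. It does not use or represent the SageMaker API, and the weights in the example are made-up numbers rather than trained parameters.

```python
def fm_predict(x, w0, w, v):
    """Second-order factorization machine score for one feature vector.

    x: feature values (length n); w0: bias; w: linear weights (length n);
    v: n x k latent factors. Uses the O(n * k) form of the pairwise term.
    """
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    k = len(v[0])
    pairwise = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise

# Hypothetical one-hot encoding: [user0, user1, item0, item1] -> user0 with item0.
x = [1, 0, 1, 0]
w0 = 0.1
w = [0.2, 0.0, 0.3, 0.0]
v = [[1.0, 0.5], [0.0, 0.0], [0.5, 1.0], [0.0, 0.0]]
print(fm_predict(x, w0, w, v))  # about 1.6: bias 0.1 + linear 0.5 + interaction 1.0
```

The interaction term is the dot product of the user's and the item's latent factors, which is how the model can score user-item pairs it has never observed together.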
- 62.
Performance evaluation
- 63.
Performance evaluation
• A/B test
• Online
  • CTR (Click-Through Rate)
  • CVR (Conversion Rate)
• Offline
  • RMSE (Root Mean Squared Error)
• MAB (Multi-Armed Bandit)
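The metrics listed above are short formulas, sketched here along with a very simple epsilon-greedy selector as one possible multi-armed bandit strategy. The talk does not say which bandit algorithm it has in mind, so epsilon-greedy is an assumption. CVR is defined per click in this sketch, although some teams define it per visit. All sample numbers are invented.

```python
import random
from math import sqrt

def ctr(clicks, impressions):
    """Click-through rate: clicks / impressions."""
    return clicks / impressions if impressions else 0.0

def cvr(conversions, clicks):
    """Conversion rate, defined here per click: conversions / clicks."""
    return conversions / clicks if clicks else 0.0

def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted))

def epsilon_greedy(stats, epsilon=0.1, rng=random):
    """Pick an arm: explore at random with probability epsilon, else best observed CTR.

    stats maps arm name -> (clicks, impressions).
    """
    if rng.random() < epsilon:
        return rng.choice(sorted(stats))
    return max(sorted(stats), key=lambda arm: ctr(*stats[arm]))

print(ctr(30, 1000), cvr(3, 30))
print(rmse([3.5, 4.0, 2.0], [4.0, 4.0, 1.0]))
print(epsilon_greedy({"algo_a": (30, 1000), "algo_b": (45, 1000)}, epsilon=0.0))  # algo_b
```

An A/B test splits traffic at a fixed ratio until the test ends, while a bandit such as the one above shifts traffic toward the better-performing algorithm as evidence accumulates.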
- 64.
Summary
• Understand your data and your service
• Knowledge of machine learning, deep learning, and statistics is required
• A data lake needs to be built
• Recommendation ties together everything from the UI to the recommendation algorithm
• Verify and test both offline and online
- 65.
Related references
• Amazon Kinesis
• https://aws.amazon.com/ko/kinesis
• Amazon S3
• https://aws.amazon.com/ko/s3
• AWS Glue
• https://aws.amazon.com/ko/glue
• Amazon SageMaker
• https://aws.amazon.com/ko/sagemaker