- Pinterest Engineering Medium !!
- Structure of Pinterest !
- Pinterest business page
- How does Pinterest use ML
- Pinterest’s Visual Lens: CV exploring taste !!!
what is it?
- social media platform and visual discovery engine
- it helps people find and save ideas and inspiration
- also has a focus on fashion and e-commerce
key questions
- What interests shall we recommend to a new user?
- How to generate an engaging home-feed?
- How do pins relate to each other?
- What interests does a pin belong to?
Pinterest is trying to do 3 things with Computer Vision:
- understand the aesthetic qualities of a product or service to make better recommendations
- look inside an image with multiple items and search for similar results using any of those items
- make the camera the tool you use to query the world
key features
pins
- media that users can save into different boards
- boards can have subgroups
- users can pin content from other sources to Pinterest
recommendation system
- Machine Learning is used to learn interests of users
- AI is used to categorize and sort uploaded photos, making it easier for images to be ranked
- representation learning model compares photos and groups based on similar qualities
- visual patterns
- metadata and user-labeled data
- By saving pins into boards, users actually create a labeled dataset describing their preferences
- non-visually-similar images can also be linked based on shared boards, captions, etc
- the ranking model considers
- domain quality
- how well photos from a website perform on the app
- pin quality
- how well a photo performs based on user interactions + engagement (with pinner’s own followers, with larger audience, etc.)
- pinner quality
- engagement rate, quality of posted pins, how much engagement a user provides to others
- topic relevance
- analyze user preferences and content embeddings of pins, to rank most relevant content to display
- PinSage, a graph convolutional network, combines the pin-board graph with visual and text features of pins to learn embeddings used for content recommendations
- Pinnability measures how likely a given user is to interact with a Pin, ensuring relevance of recommendations
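A minimal sketch of how the four ranking signals listed above could be combined into one score. The `PinSignals` type, the weights, and the linear combination are assumptions for illustration only; the notes do not say how Pinterest's ranking model actually weighs these signals.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PinSignals:
    """The four signals from the notes, each assumed to be scaled to [0, 1]."""
    domain_quality: float
    pin_quality: float
    pinner_quality: float
    topic_relevance: float


# Hypothetical weights, chosen only to make the example concrete.
WEIGHTS = {
    "domain_quality": 0.2,
    "pin_quality": 0.3,
    "pinner_quality": 0.2,
    "topic_relevance": 0.3,
}


def ranking_score(signals: PinSignals, weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted sum of the signals (an assumed, simplified scoring rule)."""
    return sum(w * getattr(signals, name) for name, w in weights.items())


def rank(pins: Dict[str, PinSignals]) -> List[str]:
    """Return pin ids ordered from highest to lowest score."""
    return sorted(pins, key=lambda pin_id: ranking_score(pins[pin_id]), reverse=True)


pins = {
    "a": PinSignals(0.5, 0.5, 0.5, 0.5),
    "b": PinSignals(0.9, 0.9, 0.9, 0.9),
}
ranked = rank(pins)
```

A real ranking model would learn how to combine these signals from engagement data rather than use fixed weights.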
visual lens
- search for items or ideas within images with Computer Vision
- automatic Object Detection to find all objects in an image in real-time
- query understanding layer
- compute visual features
- objects
- salient colors
- lighting
- image quality conditions
- compute semantic features such as annotations and category
- blender: blends results from multiple sources
- visual search returns visually similar results
- enables object-to-object matching
- seamless with auto object detection
- challenge: collecting labeled bounding boxes for regions of interest; Pinterest aggregated the image crops Pinners made in visual searches to learn which objects they are interested in
- annotations of the results visually similar to each crop are aggregated, and each crop is assigned a weak label across hundreds of object categories
- uses Faster R-CNN, a detector built on convolutional neural networks (CNNs)
- identifies regions likely to contain objects of interest by running a CNN to produce a feature map
- for each location on the feature map, the network considers a fixed set of regions and uses a binary softmax classifier to determine how likely each is to contain an object of interest
- for each candidate region, performs spatial pooling to produce a feature vector of fixed size
- the feature vector is fed to a detection network, which uses softmax to classify the region as either background or an object category
- further adjustments to the box boundaries refine the detection
- non-maximum suppression to filter duplicate detections
- object search returns scenes with visually similar objects
- traditional visual search systems treat whole image as a unit
- Pinterest wanted to understand images at a more fine-grained level
- it knows both the location and semantic meaning of billions of objects in its image corpus!
- objects are the unit: given an input image, it finds the most visually similar objects in billions of images, maps those back to the original images, and returns scenes containing similar objects
- image search returns personalized text search results that are semantically relevant to the input image
- the blender dynamically adjusts blending ratios and sources based on info from the query understanding layer
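The last Faster R-CNN step in the notes above, non-maximum suppression, can be sketched in plain Python. The box format `(x1, y1, x2, y2)`, the function names, and the 0.5 IoU threshold are assumptions for illustration, not Pinterest's implementation.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(
    boxes: Sequence[Box], scores: Sequence[float], iou_threshold: float = 0.5
) -> List[int]:
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap it
    by more than iou_threshold, and repeat. Returns indices of kept boxes."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping detections of one object, plus a separate object.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
```

Here the second box overlaps the first with an IoU of about 0.68, so only the higher-scoring of the two survives alongside the separate third box.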
data architecture
- Apache Kafka - process live data feeds
- Redshift - manage and analyze data
- Hadoop - process massive data
- Storm - perform computations and process data in real-time
- HBase - backend storage
analytics
- service providing stats on website traffic
- engagement, impressions, pin clicks, outbound clicks, saves
- data can be used to investigate product popularity based on time of day, trends, etc.
- marketing agencies can utilize this data to optimize for selling potential of products
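As a rough illustration of the time-of-day analysis mentioned above, the sketch below totals saves per hour from a list of events. The event schema (`timestamp` and `saves` keys) and the `saves_by_hour` helper are hypothetical and do not reflect the format of any Pinterest analytics export.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, Mapping


def saves_by_hour(events: Iterable[Mapping]) -> Dict[int, int]:
    """Sum the 'saves' metric per hour of day (0-23)."""
    totals: Dict[int, int] = defaultdict(int)
    for event in events:
        hour = datetime.fromisoformat(event["timestamp"]).hour
        totals[hour] += event["saves"]
    return dict(totals)


# Made-up sample data, only to show the shape of the input.
events = [
    {"timestamp": "2024-01-01T09:15:00", "saves": 3},
    {"timestamp": "2024-01-01T09:45:00", "saves": 2},
    {"timestamp": "2024-01-01T20:05:00", "saves": 7},
]
hourly = saves_by_hour(events)
best_hour = max(hourly, key=hourly.get)
```

The same grouping works for impressions, pin clicks, or outbound clicks by swapping the metric key.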