Dataset

Dataset I for General Compatibility Modeling

Most existing released datasets are collected from wild street photos and thus inevitably involve clothing parsing, which remains a great challenge in computer vision. In addition, these datasets lack rich contextual metadata for each fashion item, which makes it intractable to fully model the fashion items. Therefore, to guarantee evaluation quality and facilitate the experiments, we constructed Dataset I by crawling outfits created by fashion experts on Polyvore. Considering that users on Polyvore can accidentally create improper outfits, we also set a threshold z=50 on the number of "likes" for each outfit to ensure the quality of the positive fashion pairs. Finally, we obtained 20,726 outfits with 14,871 tops and 13,663 bottoms. For each fashion item, we collected its visual image, categories and title description. We list the attributes and their values extracted from the categories and title descriptions in the right chart.
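The like-based quality filter described above can be sketched as follows. This is only an illustrative sketch, not the crawler we actually used: the record fields (`id`, `likes`, `top`, `bottom`) and the function names are hypothetical, and we assume here that an outfit is kept when its like count is at least z.

```python
from typing import Dict, List, Set, Tuple

Z = 50  # like threshold z from the text


def filter_outfits(outfits: List[Dict], min_likes: int = Z) -> List[Dict]:
    """Keep outfits whose like count reaches the threshold.

    Assumption: "threshold z=50" is read as likes >= 50.
    """
    return [o for o in outfits if o.get("likes", 0) >= min_likes]


def positive_pairs(outfits: List[Dict]) -> Set[Tuple[str, str]]:
    """Collect the (top, bottom) pairs of the retained outfits."""
    return {(o["top"], o["bottom"]) for o in outfits}


# Toy records with hypothetical field names.
sample = [
    {"id": "o1", "likes": 120, "top": "t1", "bottom": "b1"},
    {"id": "o2", "likes": 12, "top": "t2", "bottom": "b2"},
    {"id": "o3", "likes": 50, "top": "t1", "bottom": "b3"},
]

kept = filter_outfits(sample)
pairs = positive_pairs(kept)
```

In this toy example, outfit `o2` is dropped because it has fewer than 50 likes, and the remaining outfits yield the positive top-bottom pairs.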

fashionVC.png
att.png

Dataset II for Personalized Compatibility Modeling 

itemexample.png

Most existing publicly available datasets lack user context, which makes it intractable to tackle the personalized clothing matching problem. It is worth noting that although the Amazon dataset contains valuable user context, it focuses more on item recommendation based on user preference and hence lacks ground truth regarding the coordination among fashion items. Moreover, the dataset used in "Collaborative fashion recommendation: a functional tensor factorization approach" contains only 150 users, which hinders practical evaluation. Therefore, to bridge this gap, we created a new large dataset for personalized clothing matching. In particular, we crawled our data from the popular fashion web service IQON (www.iqon.jp), where users are encouraged to create outfits by coordinating fashion items from complementary categories (e.g., tops, bottoms, shoes and accessories).

Dataset III for Personalized Wardrobe Creation

We noticed that the user purchase history, especially the size of purchased fashion items, which involves specific body measurements (such as hip girth and waist girth), conveys more reliable cues of the user's body shape. Inspired by this, we constructed our own dataset, named Dataset III, by collecting user purchase histories from Amazon. In particular, we first collected a set of popular fashion items from Amazon. After that, based on item comments, we tracked a large number of Amazon users. We crawled their recent historical purchase records (limited to 100 records) and selected the fashion items from them. To guarantee the quality of the dataset for PCW creation, we screened out users with fewer than 6 historical purchase records, and then obtained 116,528 user-item records involving 11,784 users and 75,695 fashion items. Each item comprises its image, title and category metadata. Both purchase sizes and ratings are available for each user-item record.
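The record cap and the user screening step above could look roughly like the sketch below. It is a simplified illustration under stated assumptions, not our actual pipeline: the field names (`user`, `item`, `timestamp`, `size`, `rating`) and function names are hypothetical, the cap of 100 is assumed to apply per user to the most recent records, and the minimum of 6 is assumed to be checked after the cap.

```python
from collections import defaultdict
from typing import Dict, List

MAX_RECORDS = 100  # cap on recent purchase records (assumed per user)
MIN_RECORDS = 6    # users with fewer records are screened out


def build_user_item_records(
    records: List[Dict],
    min_records: int = MIN_RECORDS,
    max_records: int = MAX_RECORDS,
) -> List[Dict]:
    """Keep each user's most recent records, then drop sparse users."""
    by_user = defaultdict(list)
    for r in records:
        by_user[r["user"]].append(r)

    kept = []
    for recs in by_user.values():
        # Most recent first, truncated to the cap.
        recent = sorted(recs, key=lambda r: r["timestamp"], reverse=True)
        recent = recent[:max_records]
        if len(recent) >= min_records:
            kept.extend(recent)
    return kept


def summarize(records: List[Dict]) -> Dict[str, int]:
    """Count records, distinct users and distinct items."""
    return {
        "records": len(records),
        "users": len({r["user"] for r in records}),
        "items": len({r["item"] for r in records}),
    }


# Toy data: u1 has 6 records (kept), u2 has 5 records (screened out).
sample = [
    {"user": "u1", "item": f"i{k}", "timestamp": k, "size": "M", "rating": 5}
    for k in range(6)
] + [
    {"user": "u2", "item": f"i{k}", "timestamp": k, "size": "S", "rating": 4}
    for k in range(5)
]

stats = summarize(build_user_item_records(sample))
```

Applied to the full crawl, a summary of this kind would correspond to the record, user and item counts reported above.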

dataset.jpg