Data Generation Strategy

Tags: Training

When we start on a new problem that calls for machine learning, especially one where supervised learning is the better fit, the first question to answer is, "How do we get labeled data?"

LinkedIn feed ranking: We can generate labeled data by first ordering the feed chronologically and collecting data on how users interact with it.

Facebook place recommendation: We can first take the places that people liked and use them as positive labels. For negative labels, we can either sample from all other places or use only the places that users saw but didn't like.
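
The two negative-labeling options for place recommendation can be sketched in Python as follows. This is a minimal illustration, not a description of any production system: the function name `build_labels`, its parameters, and the sample place names are all hypothetical, and a real pipeline would read likes and impressions from logs instead of in-memory sets.

```python
import random


def build_labels(liked, seen, all_places, strategy="seen_not_liked",
                 n_random=None, seed=0):
    """Return (place, label) pairs for one user; 1 = positive, 0 = negative.

    All names here are illustrative assumptions:
    liked       places the user liked (positives)
    seen        places the user was shown
    all_places  the full catalog of places
    strategy    "seen_not_liked" or "random"
    n_random    number of negatives to draw when strategy == "random"
    """
    liked = set(liked)
    positives = [(place, 1) for place in sorted(liked)]

    if strategy == "seen_not_liked":
        # Negatives are places the user saw but did not like.
        negatives = sorted(set(seen) - liked)
    elif strategy == "random":
        # Negatives are sampled from every place the user did not like.
        pool = sorted(set(all_places) - liked)
        k = len(pool) if n_random is None else min(n_random, len(pool))
        negatives = random.Random(seed).sample(pool, k)
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    return positives + [(place, 0) for place in negatives]


if __name__ == "__main__":
    liked = {"cafe", "park"}
    seen = {"cafe", "park", "gym", "mall"}
    all_places = {"cafe", "park", "gym", "mall", "zoo", "museum"}

    print(build_labels(liked, seen, all_places, "seen_not_liked"))
    print(build_labels(liked, seen, all_places, "random", n_random=2))
```

Sampling from all other places is cheap but treats places the user never saw as negatives, which may or may not reflect a real dislike. Restricting negatives to places that were seen but not liked gives a stronger signal at the cost of fewer examples.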