Senior Staff Software Engineer
Rui Niu is a senior member of technical staff at StubHub. He has over 18 years of experience in software engineering and software architecture and has focused on search for the last five years. At StubHub, he works on building search infrastructure and improving search relevancy, with expertise in managing and scaling search infrastructure to millions of queries per day.
Rui Niu is speaking at the following session/s
Bootstrapping LETOR for suggestions
Search suggestions are the most frequently used functionality at StubHub, with an average of more than 12 million hits on a normal day. Expectations for search suggestions change from day to day, which makes human-ranked data sets very hard to use: data ranked on one day is no longer relevant a few days later. This session discusses an approach to bootstrapping machine-learning-based ranking for this most heavily used API. The baseline method uses sales as the ranking criterion. A logistic regression model with heuristically determined weights is then introduced, and it performs better than the baseline. The click-through data obtained from this model is used to train a more sophisticated neural net model for higher accuracy. The system is two-pass: the first pass relies on Solr to obtain a candidate set of documents, and the second pass re-ranks those documents based on finer parameters.
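The second-pass re-ranking step with a hand-weighted logistic regression can be sketched roughly as follows. This is a minimal illustration, not StubHub's implementation: the feature names (`sales`, `ctr`, `text_score`), the weights, the bias, and the candidate list standing in for first-pass Solr results are all hypothetical.

```python
import math

# Hypothetical, hand-picked weights (the "heuristic" part of the model).
# These are illustrative values, not the ones used in the talk.
WEIGHTS = {"sales": 0.8, "ctr": 1.5, "text_score": 0.6}
BIAS = -1.0


def score(features):
    """Logistic regression score in (0, 1) for one candidate."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def rerank(candidates, top_k=5):
    """Second pass: order first-pass candidates by model score, best first."""
    ordered = sorted(candidates, key=lambda c: score(c["features"]), reverse=True)
    return ordered[:top_k]


# Stand-in for the candidate set a first-pass Solr query would return.
candidates = [
    {"id": "a", "features": {"sales": 0.9, "ctr": 0.1, "text_score": 0.5}},
    {"id": "b", "features": {"sales": 0.2, "ctr": 0.8, "text_score": 0.7}},
    {"id": "c", "features": {"sales": 0.1, "ctr": 0.1, "text_score": 0.2}},
]

ranked = rerank(candidates)
print([c["id"] for c in ranked])
```

With these assumed weights, candidate `b` outranks `a` even though `a` has higher sales, which is the kind of difference from a sales-only baseline that click-through logging can then capture as training data for a later model.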
Attendees will learn how to get started with LETOR for their existing system. Obtaining good training data is very important for any ML system and is the bottleneck that keeps many organizations from getting started. We encourage and guide attendees to bootstrap an ML-based ranking system with minimal effort and risk.
This session is for mid-level to senior software engineers, engineering managers, data scientists, and product managers who want to introduce LETOR in their system. It explains an intermediate, easy, and low-risk state that can eventually be transformed into full-fledged LETOR using widely popular XGBoost or neural network models. Some basic knowledge of information retrieval and machine learning is required.