Dissertation Title
Intelligent and Multilevel Patent Classification
Abstract
Many thousands of patent applications arrive at patent offices around the world every day, and millions of patent professionals and researchers undertake innovation-related tasks (e.g., prior-art search, patentability assessment, technology landscaping, infringement analysis). An important task when a patent application is submitted is the assignment of one or more classification codes from a complex, hierarchical patent classification scheme. Correct classification allows the application to be routed to an appropriate patent examiner who is knowledgeable in the specific technical field and, if the application proceeds, makes the described invention easy to find in subsequent searches. This task is typically undertaken by patent professionals, but the large number of applications and the potential complexity of an invention usually leave them overwhelmed. There is therefore a need to support this manual task, or even to automate it fully, ideally with an accuracy close to that of patent professionals.
In general, patent classification is a single-label or multi-label text classification problem over patent documents, which are long technical documents with a distinctive language and structure. The problem brings together several research fields, namely Information Retrieval (IR), Natural Language Processing (NLP), and Machine Learning and Deep Learning (ML/DL), and it poses a large number of challenges that remain to be addressed.
The main objective of this PhD thesis is the research and development of intelligent methods for automated patent classification at multiple levels of a hierarchical classification scheme. It aims 1) to research and develop novel methods for representation and classification with respect to the automated patent classification task; and 2) to devise a methodological framework so that the proposed methods can: a) be evaluated with system-oriented experiments and tested/validated with user-centered studies, b) support patent retrieval tasks (e.g., prior art search, patentability, technology landscape) and c) be adapted and transferred to other text classification problems such as automatic detection of text genre, sentiment analysis, and fake news detection.
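To make the idea of classification at multiple levels of a hierarchical scheme concrete, the following minimal sketch expands one classification symbol into the label it implies at each level. It assumes an IPC-style symbol layout (section, class, subclass, main group, subgroup); the abstract does not name a specific scheme, and the function names `expand_levels` and `labels_at_level` are hypothetical, introduced only for illustration.

```python
import re

# Assumption: symbols follow an IPC-style layout such as "H04L 12/56"
# (section letter, two-digit class, subclass letter, main group, subgroup).
SYMBOL_PATTERN = re.compile(r"^([A-H])(\d{2})([A-Z])\s*(\d{1,4})/(\d{2,6})$")


def expand_levels(symbol):
    """Return the label implied by one symbol at each hierarchy level."""
    match = SYMBOL_PATTERN.match(symbol.strip())
    if match is None:
        raise ValueError(f"not an IPC-style symbol: {symbol!r}")
    section, cls, subclass, group, subgroup = match.groups()
    prefix = section + cls + subclass
    return {
        "section": section,
        "class": section + cls,
        "subclass": prefix,
        "main_group": f"{prefix} {group}",
        "subgroup": f"{prefix} {group}/{subgroup}",
    }


def labels_at_level(symbols, level):
    """Collapse a document's symbols to its distinct labels at one level."""
    return sorted({expand_levels(s)[level] for s in symbols})


# A hypothetical document carrying three symbols (a multi-label case).
doc_symbols = ["H04L 12/56", "H04W 4/00", "G06F 17/30"]
class_labels = labels_at_level(doc_symbols, "class")
subclass_labels = labels_at_level(doc_symbols, "subclass")
```

The same document can therefore carry fewer distinct labels at a coarse level than at a fine one, which is one reason the number of target labels, and with it the difficulty of the task, changes from level to level.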