© 2017 IEEE. Named Entity Recognition (NER) is an important natural language processing (NLP) tool for information extraction and retrieval from unstructured texts such as newspapers, blogs and emails. NER processes unstructured text to classify words or expressions into relevant categories. In the literature, NER has been developed for various languages, but limited work has been conducted on NER for Persian text. This is due to the limited resources (such as corpora and lexicons) and tools available for Persian named entities. In this paper, a novel scalable system for Persian Named Entity Recognition (PNER) is presented. The proposed PNER can recognize and extract the three most important named entities in Persian script: person name, location and date. It combines a grammatical rule-based approach with machine learning, integrating dictionaries of Persian named entities, Persian grammar rules and a Support Vector Machine (SVM). Evaluated in terms of precision, recall and F-measure, PNER achieves results comparable with state-of-the-art NER frameworks in other languages.
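The abstract names precision, recall and F-measure but does not say how they were computed. As a minimal sketch, assuming exact-match scoring at the entity level (an assumption, not something the abstract confirms), the three metrics could be calculated as follows. The `Entity` tuple layout, the function name `precision_recall_f1` and the toy data are all hypothetical and chosen only for illustration.

```python
from typing import Set, Tuple

# Hypothetical entity representation: (start_token, end_token, label).
Entity = Tuple[int, int, str]


def precision_recall_f1(gold: Set[Entity],
                        predicted: Set[Entity]) -> Tuple[float, float, float]:
    """Exact-match, entity-level precision, recall and balanced F-measure (F1).

    An entity counts as correct only if its span and label both match a
    gold entity. This matching rule is an assumption made for this sketch.
    """
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data, invented for illustration; these are not results from the paper.
gold = {(0, 1, "PERSON"), (4, 4, "LOCATION"), (7, 8, "DATE")}
predicted = {(0, 1, "PERSON"), (4, 4, "LOCATION"),
             (6, 8, "DATE"), (10, 10, "LOCATION")}

p, r, f = precision_recall_f1(gold, predicted)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In this toy case two of the four predicted entities match the gold set exactly, so precision is 2/4, recall is 2/3 and F1 is 4/7. The DATE prediction with the wrong start token counts as an error under exact matching; a more lenient partial-match scheme would score it differently.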

Original publication

DOI

10.1109/ICCI-CC.2017.8109733

Type

Conference paper

Publication Date

14/11/2017

Pages

79 - 83