Please use this identifier to cite or link to this item: http://repositoriodigital.ipn.mx/handle/123456789/16613
Full metadata record
| DC Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Dubey, Ajay | - |
| dc.contributor.author | Varma, Vasudeva | - |
| dc.date.accessioned | 2013-08-06T17:28:03Z | - |
| dc.date.available | 2013-08-06T17:28:03Z | - |
| dc.date.issued | 2013-06-07 | - |
| dc.identifier.citation | Revista Computación y Sistemas; Vol. 17 No.2 | es |
| dc.identifier.issn | 1405-5546 | - |
| dc.identifier.uri | http://www.repositoriodigital.ipn.mx/handle/123456789/16613 | - |
| dc.description.abstract | Abstract. Building bilingual dictionaries from Wikipedia has been extensively studied in the area of computational linguistics. These dictionaries play a crucial role in Natural Language Processing (NLP) applications like Cross-Lingual Information Retrieval, Machine Translation and Named Entity Recognition. To build these dictionaries, most of the existing approaches use information present in Wikipedia titles, info-boxes and categories. Interestingly, not many use the structural properties of a document like sections, subsections, etc. In this work we exploit the structural properties of documents to build a bilingual English-Hindi dictionary. The main intuition behind this approach is that documents in different languages discussing the same topic are likely to have similar structural elements. Though we present our experiments only for Hindi, our approach is language-independent and can be easily extended to other languages. The major contribution of our work is that the dictionary contains translations and transliterations of words, which include Named Entities to a large extent. We evaluate our dictionary using manually computed precision. We generated a massive list of 72k tokens using our approach with 0.75 precision. | es |
| dc.description.sponsorship | Instituto Politécnico Nacional - Centro de Investigación en Computación (CIC). | es |
| dc.language.iso | en_US | es |
| dc.publisher | Revista Computación y Sistemas; Vol. 17 No.2 | es |
| dc.relation.ispartofseries | Revista Computación y Sistemas;Vol. 17 No.2 | - |
| dc.subject | Keywords. Bilingual dictionary, comparable corpora, structural elements. | es |
| dc.title | Generation of Bilingual Dictionaries using Structural Properties | es |
| dc.title.alternative | Generación de diccionarios bilingües usando las propiedades estructurales | es |
| dc.type | Article | es |
| dc.description.especialidad | Investigación en Computación | es |
| dc.description.tipo | PDF | es |
Appears in collections: Revistas

Files in this item:

| File | Description | Size | Format |
| --- | --- | --- | --- |
| 161_ART 5.pdf | | 445.48 kB | Adobe PDF |


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.