The Egyptian Journal of Language Engineering
Salama, H., Alansary, S. (2016). Building a POS-Annotated Corpus For Egyptian Children. The Egyptian Journal of Language Engineering, 3(1), 12-23. doi: 10.21608/ejle.2016.60164

Building a POS-Annotated Corpus For Egyptian Children

Article 2, Volume 3, Issue 1, April 2016, Pages 12-23
Document Type: Original Article
DOI: 10.21608/ejle.2016.60164
Authors
Heba Salama 1; Sameh Alansary 2
1 Phonetics and Linguistics Department, Faculty of Arts, Alexandria University
2 Faculty of Literature, Alexandria University
Abstract
In this paper, we present an attempt at developing a POS-annotated corpus for Egyptian children. Linguistic annotation of corpora provides researchers with better means for exploring the development of grammatical constructions and their usage. This is an initial annotated corpus for Egyptian children. It implements part-of-speech (POS) tagging, yielding a morphologically annotated corpus of spoken Arabic child language. POS tags are entered manually in the "%mor" (morphology) tiers. Coding language transcripts for computer analysis is a daunting task: it took approximately 170 hours, and manual annotation therefore focused on one particular child. The POS coding process started with a purely manual annotation of 2701 words: 1380 words for an adult and 1321 words for the child. Annotating child language proved to be a challenging and time-consuming task. MOR grammars exist for many languages, such as English, French, German, Japanese, Cantonese, and Hebrew, and for these languages the annotation is generated automatically by CLAN's automatic coding system, the MOR program. This is not applied to Egyptian Arabic for two reasons. First, there is no previous work on constructing such a system for Egyptian Arabic. Second, the morphology of Egyptian Arabic is very rich and differs from that of other languages, so their rules cannot be applied to Arabic. The two existing Arabic studies, on Qatari and Emirati Arabic, used semi-automatic and mini-automatic MOR. Finally, certain applications of linguistic analysis commands are provided using the CLAN software. The analyses include frequency counts, word searches, co-occurrence analyses, MLU (mean length of utterance) counts, and analyses of specified pairs of utterances.
The transcript data support morphological analysis, such as mean length of utterance (MLU) counts; lexical analysis, such as frequency (FREQ) counts; syntactic analysis, such as searching the data for specified combinations of words or complex string patterns (COMBO); and discourse and interactional analysis, such as the analysis of specified pairs of utterances (CHIP).
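To make the MLU and FREQ counts mentioned above concrete, the following is a minimal Python sketch, not the CLAN implementation and not the authors' procedure. It assumes a CHAT-style transcript in which main-tier lines start with a speaker code such as `*CHI:`; the transcript text and the function names `utterances`, `mlu_in_words`, and `word_frequencies` are hypothetical and introduced only for illustration. As a simplification, it counts words on the main tier, whereas an MLU count over the "%mor" tier would typically count morphemes.

```python
from collections import Counter

# Hypothetical CHAT-style excerpt. The utterances are invented placeholders,
# not data from the corpus described in the abstract.
TRANSCRIPT = """\
*CHI:\twant juice .
*MOT:\tyou want some more juice ?
*CHI:\tmore juice please .
"""

TERMINATORS = {".", "?", "!"}


def utterances(transcript, speaker):
    """Return the tokenized main-tier utterances of one speaker (e.g. 'CHI')."""
    prefix = f"*{speaker}:"
    result = []
    for line in transcript.splitlines():
        if line.startswith(prefix):
            tokens = line[len(prefix):].split()
            # Utterance terminators are punctuation, not words.
            words = [t for t in tokens if t not in TERMINATORS]
            if words:
                result.append(words)
    return result


def mlu_in_words(transcript, speaker):
    """Mean length of utterance, counted in words (a simplification)."""
    utts = utterances(transcript, speaker)
    if not utts:
        return 0.0
    return sum(len(u) for u in utts) / len(utts)


def word_frequencies(transcript, speaker):
    """Frequency count of word forms for one speaker."""
    return Counter(w for u in utterances(transcript, speaker) for w in u)


if __name__ == "__main__":
    print("MLU (CHI):", mlu_in_words(TRANSCRIPT, "CHI"))
    print("FREQ (CHI):", word_frequencies(TRANSCRIPT, "CHI").most_common())
```

On this invented excerpt the child produces two utterances of two and three words, so the word-based MLU is 2.5. A morpheme-based count would instead read the annotations on the "%mor" tier, which is why the manual POS coding described above is a prerequisite for that kind of analysis.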
Keywords
POS annotated corpus; CHILDES database