AN ANNOTATED CORPUS WITH SUPPORT VERB CONSTRUCTIONS IN PORTUGUESE
DOI:
https://doi.org/10.22409/gragoata.v20i38.33307
Keywords:
support verb, predicative noun, Lexicon-Grammar, corpus annotation.
Abstract
Support verb constructions (SVC) are a type of nominal construction in which the core predicate is a noun, called the 'predicative noun' (Npred), assisted by a verb, called the 'support verb' (Vsup). This paper adopts the Lexicon-Grammar theoretical and methodological framework for the linguistic description and formalization of SVC in Portuguese. Given the syntactic and semantic differences between SVC and other types of constructions, the purpose of this paper is to present the methodology and results of creating a corpus annotated with Vsup and Npred. A list of 4,668 SVC was built, covering 45 variants of Vsup and around 3,200 different Npred. Based on this list, we extracted 121,198 sentences from the full PLN.Br corpus, of which 2,646 were manually annotated. This sample may serve as a reference corpus for the processing of SVC and as a gold standard for evaluating the automatic identification, extraction, or classification of SVC, as well as for other Natural Language Processing (NLP) applications.
License
Authors who publish in Gragoatá agree to the following terms:
The authors retain the rights and grant the journal the right of first publication, with the work simultaneously licensed under a Creative Commons CC-BY-NC 4.0 license, which allows third parties to share it with due acknowledgment of the authorship and of its first publication in Gragoatá.
Authors may enter into additional and separate contractual arrangements for the non-exclusive distribution of the published version of the work (for example, posting it in an institutional repository or publishing it in a book), with recognition of its initial publication in Gragoatá.
Gragoatá is licensed under a Creative Commons Attribution-NonCommercial 4.0 International license.