Czech National Corpus
The CNC, which is accessible to the broad academic public at home and abroad, is operated through a set of programs that allow the user to search for linguistic units, whether words, word forms, parts of words or collocations, and to examine their frequency, grammatical and other characteristics. The CNC in its balanced, reasonably representative form is to be released at the turn of 1999/2000, but provisional access has been offered to anyone since 1996. Search results are returned in concordance format, which lets the user study the real contextual use of words and similar units. The concordances obtained in this way can be further processed, sorted, classified, etc. This makes working with language more like quick play than the old-time drudgery.
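To make the idea of a concordance concrete, here is a minimal sketch of a keyword-in-context (KWIC) listing with a frequency count and a sort by right-hand context. It is an illustration only and does not reproduce the CNC's own software or query language; the function names `tokenize` and `concordance`, the `width` parameter and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"\w+", text.lower())


def concordance(tokens, keyword, width=2):
    """Return one (left context, keyword, right context) tuple per hit.

    `width` is the number of tokens kept on each side of the keyword.
    """
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Hypothetical three-clause sample, not taken from the CNC.
sample = "Pes spal doma. Kocour spal venku a pes spal dlouho."
tokens = tokenize(sample)

hits = concordance(tokens, "spal", width=2)   # every occurrence in context
freq = Counter(tokens)                        # word-form frequencies
by_right = sorted(hits, key=lambda h: h[2])   # sort hits by right context

for left, kw, right in by_right:
    # Right-align the left context so the keyword forms a column.
    print(f"{left:>15}  [{kw}]  {right}")
print("frequency of 'spal':", freq["spal"])
```

A real corpus tool would add lemmatization and grammatical tagging on top of this, so that a search could match every form of a word rather than a single word form.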
Alongside the contemporary CNC (100 million words, with more to come later), two small corpora, one of Old Czech and one of Spoken Czech, are being built at the same time.
Public Internet access to a small part of the CNC (some 20 million words) is open to anyone.
For access to the full CNC, you have to contact the administrator (http://ucnk.ff.cuni.cz) and ask for special permission, which is granted to anyone for non-commercial purposes.
Marie Nováková, ©2001