      An Introduction to the British National Corpus (BNC)
      Author: admin    Source: original to this site    Updated: 2011-11-16



       

      BNC = The British National Corpus

      http://www.natcorp.ox.ac.uk/ (BNC website)

      http://corpus.byu.edu/bnc/ (BNC online search interface)

       

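      The British National Corpus (BNC) is one of the largest corpora that can be used directly online. It was built jointly by Oxford University Press, Longman, Chambers Harrap, Oxford University Computing Services, Lancaster University's centre for computer research on the English language (UCREL), and the British Library, and was completed in 1994.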

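      The BNC is an electronic resource of more than 100 million words, sampled from a wide range of written and spoken sources to represent British English of the late 20th century. It is made up of 4,124 texts covering a broad spread of modern British English; written language accounts for 90% of the corpus and spoken language for 10%. The latest edition is the BNC XML Edition (2007). The corpus was originally marked up in SGML, an international standard (the 2007 edition uses XML), and its three-stage tagging procedure reduced the tagging error rate from 3% to 1%. In practice, the corpus can be searched with its companion SARA software, works with a variety of general-purpose retrieval tools, and can also be queried directly online.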

      What is the BNC?

      The British National Corpus (BNC) is a 100 million word collection of samples of written and spoken language from a wide range of sources, designed to represent a wide cross-section of British English from the later part of the 20th century, both spoken and written. The latest edition is the BNC XML Edition, released in 2007.

      The written part of the BNC (90%) includes, for example, extracts from regional and national newspapers, specialist periodicals and journals for all ages and interests, academic books and popular fiction, published and unpublished letters and memoranda, school and university essays, among many other kinds of text. The spoken part (10%) consists of orthographic transcriptions of unscripted informal conversations (recorded by volunteers selected from different age, region and social classes in a demographically balanced way) and spoken language collected in different contexts, ranging from formal business or government meetings to radio shows and phone-ins.

      The corpus is encoded according to the Guidelines of the Text Encoding Initiative (TEI) to represent both the output from CLAWS (automatic part-of-speech tagger) and a variety of other structural properties of texts (e.g. headings, paragraphs, lists etc.). Full classification, contextual and bibliographic information is also included with each text in the form of a TEI-conformant header.
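
      As a rough illustration of what this encoding makes possible, the sketch below parses a small fragment written in the style of the XML Edition and lists each word with its CLAWS tag. The element and attribute names (`s`, `w`, `c`, `c5`, `hw`, `pos`) are assumed to follow the XML Edition's conventions; the fragment itself is invented for this example and is not taken from the corpus.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment written for this example (not a real BNC text).
# <w> marks a word, <c> a punctuation mark; c5 holds the CLAWS C5 tag,
# hw the headword, and pos a simplified word class.
FRAGMENT = """
<s n="1">
  <w c5="AT0" hw="the" pos="ART">The </w>
  <w c5="NN1" hw="corpus" pos="SUBST">corpus </w>
  <w c5="VBZ" hw="be" pos="VERB">is </w>
  <w c5="AJ0" hw="large" pos="ADJ">large</w>
  <c c5="PUN">.</c>
</s>
"""


def tagged_words(xml_text):
    """Return a (word, C5 tag, headword) tuple for every <w> element."""
    root = ET.fromstring(xml_text.strip())
    return [
        ((w.text or "").strip(), w.get("c5"), w.get("hw"))
        for w in root.iter("w")
    ]


if __name__ == "__main__":
    for word, tag, headword in tagged_words(FRAGMENT):
        print(f"{word}\t{tag}\t{headword}")
```

      The same loop could be pointed at a real corpus file, but the surrounding header and text structure would then need to be taken into account.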

      Work on building the corpus began in 1991, and was completed in 1994. No new texts have been added after the completion of the project but the corpus was slightly revised prior to the release of the second edition BNC World (2001) and the third edition BNC XML Edition (2007). Since the completion of the project, two sub-corpora with material from the BNC have been released separately: the BNC Sampler (a general collection of one million written words, one million spoken) and the BNC Baby (four one-million word samples from four different genres).

       

      Full technical documentation covering all aspects of the BNC, including its design, markup, and contents, is provided by the Reference Guide for the British National Corpus (XML Edition). For earlier versions of the Reference Guide and other documentation, see the BNC Archive page.

      What sort of corpus is the BNC?

      Monolingual: It deals with modern British English, not other languages used in Britain. However, non-British English and foreign-language words do occur in the corpus.

      Synchronic: It covers British English of the late twentieth century, rather than the historical development which produced it.

      General: It includes many different styles and varieties, and is not limited to any particular subject field, genre or register. In particular, it contains examples of both spoken and written language.

      Sample: For written sources, samples of 45,000 words are taken from various parts of single-author texts. Shorter texts up to a maximum of 45,000 words, or multi-author texts such as magazines and newspapers, are included in full. Sampling allows for a wider coverage of texts within the 100 million limit, and avoids over-representing idiosyncratic texts.
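
      The sampling rule can be sketched in a few lines. This is only an illustration of the rule as stated above: the function name, the `multi_author` flag and the `start` offset are hypothetical, and the way the BNC compilers actually chose which part of a long text to sample is not described here.

```python
SAMPLE_LIMIT = 45_000  # word limit stated in the text above


def take_sample(words, multi_author=False, start=0, limit=SAMPLE_LIMIT):
    """Return the part of a text that would enter the corpus.

    Multi-author texts and texts no longer than `limit` are kept whole.
    Longer single-author texts contribute one contiguous sample of
    `limit` words beginning at `start` (a hypothetical parameter).
    """
    if multi_author or len(words) <= limit:
        return list(words)
    # Clamp the offset so that the sample always fits inside the text.
    start = max(0, min(start, len(words) - limit))
    return list(words[start:start + limit])


if __name__ == "__main__":
    novel = ["word"] * 120_000      # long single-author text
    essay = ["word"] * 3_000        # short text, kept in full
    magazine = ["word"] * 80_000    # multi-author text, kept in full

    print(len(take_sample(novel, start=60_000)))
    print(len(take_sample(essay)))
    print(len(take_sample(magazine, multi_author=True)))
```

      Capping long texts in this way is what lets the corpus cover more sources within the 100-million-word limit.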

       
