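NLTK stands for Natural Language Toolkit. It is a Python-based toolkit for natural language processing.
After installing the NLTK module, installing nltk_data through the built-in downloader kept failing because of network problems, so the only option left was to download the packages and install them by hand. Manual downloads often fail for the same reason, so you can get a ready-made copy from the file-sharing link below.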
Download link: http://www.3qphp.com/down/110/107.html
After downloading, install it as follows:
Unzip the archive: nltk_data-gh-pages.zip
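Copy the packages folder to D:\python\packages, then rename it to nltk_data so that the path becomes D:\python\nltk_data
Then create a system environment variable so that NLTK can find the data. NLTK reads the NLTK_DATA variable, so setting NLTK_DATA to D:\python\nltk_data should work.
Installation is complete.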
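The unzip, copy, and rename steps above can also be scripted. The following is only a rough sketch using the standard library. The function names install_nltk_data and env_points_to are made up for illustration, and it assumes the archive contains a folder named packages somewhere inside it (as the steps above imply) and that the environment variable in question is NLTK_DATA.

```python
import os
import shutil
import tempfile
import zipfile


def install_nltk_data(zip_path, target_dir):
    """Extract the archive and move its 'packages' folder to target_dir.

    Hypothetical helper: assumes the zip contains a folder called
    'packages' at some level, as described in the manual steps.
    """
    if os.path.exists(target_dir):
        raise FileExistsError(target_dir + " already exists")
    # Make sure the parent folder (e.g. D:\python) exists before moving.
    parent = os.path.dirname(os.path.abspath(target_dir))
    os.makedirs(parent, exist_ok=True)
    with tempfile.TemporaryDirectory() as tmp:
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(tmp)
        # Take the first folder named 'packages' found in the extracted tree.
        for root, dirs, _files in os.walk(tmp):
            if "packages" in dirs:
                shutil.move(os.path.join(root, "packages"), target_dir)
                return target_dir
    raise FileNotFoundError("no 'packages' folder found in " + zip_path)


def env_points_to(target_dir):
    """Return True if the NLTK_DATA environment variable lists target_dir."""
    entries = os.environ.get("NLTK_DATA", "").split(os.pathsep)
    wanted = os.path.normcase(os.path.abspath(target_dir))
    return any(
        os.path.normcase(os.path.abspath(entry)) == wanted
        for entry in entries
        if entry
    )
```

With the paths used in this post, the call would be install_nltk_data("nltk_data-gh-pages.zip", r"D:\python\nltk_data"), and env_points_to(r"D:\python\nltk_data") then tells you whether the environment variable is already set for the current process. A newly created system variable is only visible to programs started after it was set.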
Test
Python code:
from nltk.book import *
Output:
D:\python\python.exe D:/phpstudy/WWW/spiderMasg/python/spider/nltkhandle.py
*** Introductory Examples for the NLTK Book ***
Loading text1, ..., text9 and sent1, ..., sent9
Type the name of the text or sentence to view it.
Type: 'texts()' or 'sents()' to list the materials.
text1: Moby Dick by Herman Melville 1851
text2: Sense and Sensibility by Jane Austen 1811
text3: The Book of Genesis
text4: Inaugural Address Corpus
text5: Chat Corpus
text6: Monty Python and the Holy Grail
text7: Wall Street Journal
text8: Personals Corpus
text9: The Man Who Was Thursday by G . K . Chesterton 1908
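This output shows that the installation succeeded.
The installation works, but nltk_data does not include a Chinese corpus, so you will also need to install a Chinese word segmentation module called jieba with pip.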
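As a rough illustration of what jieba gives you, here is a small sketch that segments a Chinese sentence and counts the most frequent words. It assumes jieba has already been installed with pip install jieba. The function names segment and top_words and the sample sentence are made up for this example; jieba.lcut is the library call that returns the segmented words as a list.

```python
from collections import Counter


def segment(text):
    """Split Chinese text into a list of words using jieba."""
    import jieba  # third-party; imported here so top_words works without it
    return jieba.lcut(text)


def top_words(tokens, n=3):
    """Return the n most common tokens, ignoring whitespace-only tokens."""
    counts = Counter(token for token in tokens if token.strip())
    return counts.most_common(n)


def demo():
    """Segment a sample sentence and print the words and their counts."""
    tokens = segment("我来到北京清华大学")
    print(tokens)
    print(top_words(tokens))
```

Calling demo() prints the segmented words followed by the most frequent ones. The token list returned by segment can also be passed on to NLTK, for example to nltk.FreqDist, if you prefer to stay within NLTK for the counting.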
Please credit the source when reposting: 谷谷点程序 » Manually downloading nltk_data, and jieba for Chinese corpus mining