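If you want base NPs, i.e. NPs without coordination, prepositional phrases or relative clauses, you can use the noun_chunks iterator on the Doc and Span objects: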
    >>> # spaCy 1.x API; from 2.0 on, use spacy.load() with an installed English model
    >>> from spacy.en import English
    >>> nlp = English()
    >>> doc = nlp(u'The cat and the dog sleep in the basket near the door.')
    >>> for np in doc.noun_chunks:
    ...     np.text
    ...
    u'The cat'
    u'the dog'
    u'the basket'
    u'the door'
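If you need something else, the best way is to iterate over the words of the sentence and use the syntactic context to decide whether a word governs the phrase type you want. If it does, yield its subtree: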
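    from spacy.symbols import nsubj, nsubjpass, dobj, iobj, pobj

    # Dependency labels whose head word usually governs a noun phrase.
    np_labels = set([nsubj, nsubjpass, dobj, iobj, pobj])  # Probably others too

    def iter_nps(doc):
        for word in doc:
            # word.dep is the integer label ID, so it compares against spacy.symbols
            if word.dep in np_labels:
                # word.subtree yields the word and all of its syntactic descendants
                yield word.subtree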
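To see what yielding a subtree gives you compared with noun_chunks, here is a minimal sketch that mimics the idea in plain Python, with no spaCy install or model needed. Everything in it is an assumption made for illustration: `ToyToken`, `subtree` and `iter_np_texts` are hypothetical stand-ins, not spaCy classes or functions, and the parse is written by hand rather than produced by a parser.

```python
from dataclasses import dataclass


@dataclass
class ToyToken:
    """Hypothetical stand-in for a parsed token (not spaCy's Token)."""
    i: int      # position in the sentence
    text: str
    dep: str    # dependency label, kept as a plain string here
    head: int   # index of the governing token; the root points to itself


NP_LABELS = {"nsubj", "nsubjpass", "dobj", "iobj", "pobj"}


def subtree(tokens, root):
    """Return root plus all of its descendants, in sentence order."""
    keep = {root.i}
    changed = True
    while changed:  # repeat until no new descendant is found
        changed = False
        for tok in tokens:
            if tok.head in keep and tok.i not in keep:
                keep.add(tok.i)
                changed = True
    return [tok for tok in tokens if tok.i in keep]


def iter_np_texts(tokens):
    """Yield the text of the subtree under every NP-governing word."""
    for tok in tokens:
        if tok.dep in NP_LABELS:
            yield " ".join(t.text for t in subtree(tokens, tok))


# Hand-written parse of "The cat sleeps in the basket near the door"
tokens = [
    ToyToken(0, "The", "det", 1),
    ToyToken(1, "cat", "nsubj", 2),
    ToyToken(2, "sleeps", "ROOT", 2),
    ToyToken(3, "in", "prep", 2),
    ToyToken(4, "the", "det", 5),
    ToyToken(5, "basket", "pobj", 3),
    ToyToken(6, "near", "prep", 5),
    ToyToken(7, "the", "det", 8),
    ToyToken(8, "door", "pobj", 6),
]

results = list(iter_np_texts(tokens))
print(results)
# ['The cat', 'the basket near the door', 'the door']
```

Note that the subtree under "basket" includes the attached prepositional phrase, and that "the door" shows up again on its own. If you only want the outermost phrases, you would have to filter out results nested inside a larger one.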



